Introduction

Antiestrogens have been successfully used in the management of breast cancer since the first
clinical trial of Tamoxifen (TAM) in 1971 (3). TAM produces a significant increase in both
overall and recurrence-free survival, but resistance eventually arises in most patients (5,6).
We  hypothesize  that  one  form  of  acquired  antiestrogen  resistance  reflects  the  altered  expression 
of  what  were  previously  estrogen-regulated  genes.  We  further  hypothesize  that  only  a  subset  of 
all  estrogen  (E2)-regulated  genes,  those  comprising  a  specific  gene  network,  is  responsible  for 
the  resistance  phenotype.  Since  TAM  (triphenylethylene)  and  ICI  182,780  (steroidal)  induce 
different  ER  conformations,  we  also  hypothesize  that  the  consequent  patterns  of  gene  regulation 
will  be  different  and  dictate  the  presence/absence  of  crossresistance  among  antiestrogens. 

To address these hypotheses, we have generated novel E2-independent and antiestrogen
resistant variants of the E2-dependent MCF-7 human breast cancer cell line (MCF7/MIII,
MCF7/LCC1, MCF7/LCC2, MCF-7/LCC9), recently reviewed in (1). We have also assembled a
panel of additional resistant cells from within this institution and from other investigators. These
include additional antiestrogen resistant MCF-7 variants (LY2, R27, R3, MCF-7RR), all of
which express ER, and the ER-negative ZR-75-1 (ZR75/LCC3, ZR-75-9a1) and T47D (T47Dco)
variants. Other resistance models are currently being obtained from other laboratories or are being
generated by in vivo selection against TAM in athymic nude rats (rats and humans
perceive TAM as a partial agonist, whereas mice perceive TAM as a pure agonist).

This  is  an  Idea  Award  to  study  the  genes  and  patterns  of  genes  expressed  in  acquired 
antiestrogen  resistance  in  cell  culture  models.  The  PI  will  apply  new,  state-of-the-art  technologies 
to  identify  key  endocrine-regulated  molecular  pathways  to  apoptosis/proliferation.  By  identifying 
key  components  of  these  pathways,  we  may  be  able  to  predict  response  to  first-line  and  crossover 
antiestrogenic  therapies,  and/or  provide  novel  therapeutic  strategies  for  antiestrogen  resistant 
tumors. 


Antiestrogen  Resistance.  Most  breast  tumors  that  initially  respond  to  TAM  recur  and 
require  other  endocrine  or  cytotoxic  therapies  (6).  Despite  over  10  million  patient  years  of 
experience  with  TAM,  the  precise  mechanisms  that  confer  acquired  resistance  are  unknown  (1). 
Absence  of  ER  expression  is  clearly  important  for  de  novo  resistance  (1).  ER  expression  is  not 
lost  in  most  breast  tumors  that  acquire  antiestrogen  resistance  (9).  Currently,  there  is  little 
compelling  evidence  that  expression  of  ER  splice  variants  and  mutant  ER  contribute  significantly 
to antiestrogen resistance in patients (1,10). While the importance of wild type ERα is established
as a mediator/predictor of antiestrogen responsiveness, that of ERβ remains unclear. ERα may be
the predominant species in most ER+ breast tumors (11,13), and is associated with a better
prognosis (7). ERβ is associated with a poorer prognosis, absence of PgR, and lymph node
involvement (4,13). One small study reported higher ERβ mRNA levels in resistant tumors (12).
However, this association could not be separated from that between ERβ and a more aggressive
phenotype (4,13). Some studies report activities independent of ER function, which may initiate
events  that  are  necessary  but  not  sufficient  for  antiestrogen-induced  effects  (1).  Our  research 
team  has  recently  reviewed  in  detail  the  potential  mechanisms  of  antiestrogen  resistance  (2). 
BODY OF REPORT

Our  purpose  is  to  evaluate  a  series  of  antiestrogen  responsive  and  resistant  breast  cancer  cell  lines 
for their patterns of gene expression. We will explore these data using state-of-the-art pattern
analysis and statistically-based methods that apply both statistical and information theory. We
will also apply the computationally simpler methods used by others in the field.

In a prior report, we made one change to the specific aims and Statement of Work. Our
collaboration with Dr. Wang's group, formerly at the Catholic University of America and now at
Virginia Tech’s Alexandria Research Institute, continues throughout this award. Within this
collaboration we have begun to develop and test several new algorithms for mining the high
dimensional data sets produced by gene expression microarray analyses.

Specific  Aims  (unchanged) 

Specific  Aim  1:  use  gene  microarrays  to  identify  differentially  expressed  genes  in  a  panel  of 
breast  cancer  cell  lines. 

Specific  Aim  2:  explore  the  data  from  Aim  1  to  identify  those  differentially  expressed  gene 
clusters  most  closely  associated  with  acquired  antiestrogen  resistance  and  test  further  novel 
algorithms  for  the  analysis  of  gene  expression  microarray  data. 

Specific  Aim  3:  begin  to  assess  the  likely  functional  relevance  of  representative  members  of 
these  clusters  and  study  their  expression  in  human  breast  cancer  biopsies. 

Long term aims: establish a pattern(s) of gene clusters that can predict antiestrogen responses in
patients.  This  could  lead  to  a  more  effective  identification  of  candidates  for  specific  antiestrogen 
therapies  and  identify  those  patients  least  likely  to  respond  and  who  may  benefit  from  an  early 
initiation  of  cytotoxic  chemotherapy. 


Key  Research  Accomplishments 


TASK  1:  Use  gene  microarrays  to  identify  differentially  expressed  genes  in  a  panel  of  breast 
cancer  cell  lines. 

We  have  completed  this  aim  with  the  possible  exception  of  arraying  RNA  against  the  new 
Affymetrix  GeneChip  that  contains  the  entire  human  genome  (released  in  Oct,  2003).  For  the 
cDNA  arrays  (Research  Genetics),  we  have  arrayed  and  individually  aligned  all  of  the  digitized 
images. We used Pathways version 4.0 and independently aligned each of the ~4,000 spots/array; this
is very time consuming but provides much higher quality data than relying on the software alone to
align each spot automatically. We have also generated some simple algorithms to more
effectively  assess  bleeding  effects  -  a  problem  with  these  radiolabeled  probes  where  the  signal 
from  an  abundant  mRNA  bleeds  into  an  adjacent  signal.  This  approach  will  be  included  in  a 
manuscript  currently  in  preparation  and  is  briefly  described  below. 

Data Preprocessing: Pathways™ 4.0 software algorithms (Research Genetics, Inc.) corrected for the
local nonspecific binding of the probe to the filter at each spot (background correction). Intensities
of both the local background and the spot for each gene were measured geometrically. Each gene’s
average intensity was measured within a circular target that is 75% of the circle
enclosing the gene spot. The average local background intensity for each gene was measured in a
square-shaped region extending from 125% to 150% of the enclosing circle.

Radioactive  signal  bleeding  from  neighboring  cDNA  spots  is  a  major  confounding  factor 
in  this  microarray  platform.  Consequently,  we  generated  a  simple  algorithm  to  detect 
compromised  signals.  For  each  gene  on  the  nylon  filter,  bleeding  effects  were  approximated  by 
calculating  the  difference  between  the  local  and  global  background  signals.  Global  background 
was  estimated  by  taking  the  mean  of  the  lowest  20%  of  all  local  background  intensities  from 
cDNA-free  regions  on  the  nylon  filter.  Genes  were  called  as  being  “very  low/not  detected”  if 
their  raw  intensity  signals  were  within  three  standard  deviations  of  the  global  background  mean. 

The  difference  between  local  and  global  background  was  calculated  as  a  percentage  of  the 
raw intensity value of that gene. A “percentage-above” threshold was constructed to indicate that
the radioactivity from neighboring spots had bled into the spot of interest, whereas a
“percentage-below” threshold was used to indicate that the bleeding effects were negligible. Genes
flagged as invalid were eliminated from the analyses.

Because  the  calculated  estimate  of  the  bleeding  effect  was  empirically  derived,  a  range  of 
threshold  values  was  inspected  and  statistical  analyses  were  re-evaluated  on  the  data  sets  with  the 
varying  threshold  values  used  to  assess  likely  radioactive  bleeding.  A  threshold  range  of  21%- 
40%  was  necessary  to  objectively  tag  and  eliminate  affected  genes  by  considering  the  bleeding 
effects  of  1)  the  local  spot,  2)  the  neighboring  spots  into  the  local  background  area,  and  3) 
nonspecific  local  background  hybridization.  In  addition,  for  genes  of  interest,  manual  spot 
visualization  was  performed  to  assess  further  the  bleeding  effect  estimate. 
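
The flagging rules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the algorithm as implemented for the manuscript: the function names are hypothetical, the standard deviation is assumed to be taken over the same lowest-20% background values that define the global mean, and the 30% cut-off is simply one value picked from the reported 21%-40% range.

```python
from statistics import mean, stdev

def global_background(local_backgrounds, fraction=0.20):
    """Mean and SD of the lowest `fraction` of local background intensities.

    Assumption: the SD is computed from the same subset as the mean.
    """
    ranked = sorted(local_backgrounds)
    k = max(2, int(len(ranked) * fraction))  # need at least 2 values for an SD
    lowest = ranked[:k]
    return mean(lowest), stdev(lowest)

def classify_spot(raw, local_bg, global_mean, global_sd, bleed_threshold_pct=30.0):
    """Label one spot as 'not_detected', 'bleeding', or 'valid'."""
    # Very low / not detected: raw signal within 3 SD of the global background mean.
    if raw <= global_mean + 3 * global_sd:
        return "not_detected"
    # Bleeding estimate: (local - global background) as a percentage of raw intensity.
    bleed_pct = 100.0 * (local_bg - global_mean) / raw
    if bleed_pct > bleed_threshold_pct:
        return "bleeding"
    return "valid"

# Hypothetical intensities, for illustration only.
g_mean, g_sd = global_background([8, 10, 20, 30, 40, 50, 60, 70, 80, 90])
labels = [classify_spot(raw, bg, g_mean, g_sd) for raw, bg in [(12, 10), (100, 50), (100, 15)]]
```

Re-running the classification over several values of `bleed_threshold_pct` mirrors the sensitivity check described in the text, where the statistical analyses were re-evaluated across a range of thresholds.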


TASK  2:  Explore  the  data  from  Aim  1  to  identify  those  differentially  expressed  gene  clusters 
most  closely  associated  with  acquired  antiestrogen  resistance. 

We have essentially completed this Task, using both the Clontech and Research Genetics
platforms.  Our  previous  report  included  tables  of  the  data  from  the  Clontech  arrays.  We  now 
include  a  Table  of  genes  from  the  Research  Genetics  platform.  We  would  not  expect  to  find  the 
same  genes  since  there  are  many  genes  on  the  ResGen  filters  that  are  not  on  the  Clontech  filters. 
Furthermore,  since  the  probes  are  prepared  and  labeled  differently,  the  signal  scale  for  each  gene 
is  different  and,  therefore,  the  ability  to  identify  differential  expression  is  not  identical. 

LCC1 (n=3; responsive), LCC2 (n=3; resistant), LCC9 (n=3; resistant), LY2 (n=3;
resistant), and RR (n=3; resistant) are estrogen independent MCF-7 breast cancer cell line
variants that are either resistant or responsive to known Selective Estrogen Receptor Modulators.
These cell lines were arrayed against Research Genetics GF211 (NamedGenes™) nylon filters,
which were probed with phosphorus-radiolabeled cDNA targets. A Molecular Dynamics Storm
Phosphorimager was used to scan the radiolabeled filters, and ResGen Pathways™ 4.0 imaging
software measured the densitometric readings from the digital scans.

In-depth  analyses/mining  were  done  on  Mathworks  MATLAB™  with  established  and 
developing  algorithms.  After  excluding  low  threshold  signals  (<0.1  in  each  group)  and  bleeding 
effects  contributed  by  the  neighboring  spots  (as  described  above),  we  initially  filtered  the  4,324 
dimensional  array  to  1,882  genes.  At  the  top  level,  we  globally  visualized  the  1,882-dimensional 
data from the breast cancer variants using Principal Component Analysis (PCA). A nonlinear
separation between the antiestrogen resistant and responsive groups is seen in this projection of
the full dimensional data space, suggesting that phenotype separation is possible but that more
samples are needed to clearly define the boundary between the resistant and responsive breast
cancer cell models (this is data visualization, not data analysis).

Dimensionality  reduction  by  gene  selection  was  necessary  to  identify  an  accurate  and 
robust  discriminant  gene  expression  profile.  We  used  a  novel  profile  selection  algorithm  that 
filters  the  1882-D  data  set  by  the  highest  signal-to-noise  ratios,  and  then  eliminates  genes  by  both 
assessing  their  contribution  to  the  profile’s  strength  and  by  maximizing  the  trace  of  a  weighted 
Fisher’s  scatter  matrix.  We  applied  the  algorithm  to  the  two  different  antiestrogen 
resistant/responsive  phenotypes  from  the  MCF-7  variants  using  either  all  samples  or  random 
multiple subsets of samples. Using all of the estrogen independent MCF-7 variant cell
lines, the algorithm identified a 20 gene subset that discriminates accurately between the two
phenotypes.  This  20-dimensional  data  set  was  projected  into  3-dimensional  space  with  both  PCA 
and  our  new  Discriminant  Component  Analysis  (DCA)  (14).  The  antiestrogen 
resistant/responsive  phenotypes  are  now  linearly  separable  in  both  visualizations  (see  Figure 
below). 
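
The first stage of the selection, ranking genes by signal-to-noise ratio, can be illustrated as follows. This is only a sketch: the report does not give the ratio's formula, so the commonly used form |mean_a - mean_b| / (sd_a + sd_b) is assumed here, the function names and example values are hypothetical, and the later stages (profile strength and the weighted Fisher scatter matrix criterion) are not reproduced.

```python
from statistics import mean, stdev

def signal_to_noise(group_a, group_b):
    """Assumed SNR for one gene: |mean_a - mean_b| / (sd_a + sd_b)."""
    spread = stdev(group_a) + stdev(group_b)
    if spread == 0:
        return float("inf") if mean(group_a) != mean(group_b) else 0.0
    return abs(mean(group_a) - mean(group_b)) / spread

def rank_genes(expression, responsive_idx, resistant_idx, top_n):
    """Return the top_n gene names ordered by decreasing signal-to-noise ratio.

    `expression` maps a gene name to a list of per-sample intensities.
    """
    scores = {}
    for gene, values in expression.items():
        responsive = [values[i] for i in responsive_idx]
        resistant = [values[i] for i in resistant_idx]
        scores[gene] = signal_to_noise(responsive, resistant)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical data: three responsive samples followed by three resistant samples.
expression = {
    "gene_a": [1.0, 1.1, 0.9, 3.0, 3.1, 2.9],  # tight groups, large difference
    "gene_b": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5],  # wide groups, small difference
}
top = rank_genes(expression, [0, 1, 2], [3, 4, 5], top_n=1)
```

A gene with a modest fold change but very tight replicates can outrank one with a larger but noisier change, which is the reason for filtering on the ratio rather than on fold change alone.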

We iteratively tested a Multi-Layer Perceptron (MLP) neural network composed of three hidden
nodes in one hidden layer. Four random samples from the antiestrogen resistant samples were
chosen as independent data sets, and the remaining samples were used to train the network
to predict the class of each independent sample (trained using the 20 genes selected above).
After 100 cycles, the MLP predicted the independent samples with an overall accuracy of 92.5%,
a proof of principle that antiestrogen resistance can be adequately predicted.
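
The repeated hold-out procedure itself is easy to sketch. To stay short and dependency-free, the example below swaps in a nearest-centroid classifier in place of the MLP, so it illustrates the validation loop only and not the neural network; the function names are hypothetical, and drawing the held-out samples from all samples is an assumption.

```python
import random

def centroid(rows):
    """Component-wise mean of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_centroid_predict(train, sample):
    """Predict the label whose class centroid is closest (squared Euclidean)."""
    by_label = {}
    for features, label in train:
        by_label.setdefault(label, []).append(features)
    best_label, best_dist = None, float("inf")
    for label, rows in by_label.items():
        dist = sum((x - c) ** 2 for x, c in zip(sample, centroid(rows)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def repeated_holdout_accuracy(data, n_holdout=4, cycles=100, seed=0):
    """Fraction of held-out samples classified correctly over all cycles."""
    rng = random.Random(seed)
    correct = total = 0
    for _ in range(cycles):
        held = set(rng.sample(range(len(data)), n_holdout))
        train = [d for i, d in enumerate(data) if i not in held]
        for i in held:
            features, label = data[i]
            correct += nearest_centroid_predict(train, features) == label
            total += 1
    return correct / total
```

With four held-out samples per cycle and 100 cycles, the accuracy is an average over 400 individual predictions, each made by a model that never saw the sample it is scoring.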


UNIGENE ID | Gene Description | RESPONSIVE / RESISTANT | RESISTANT / RESPONSIVE | Student t-test (equal)
-----------|------------------|------------------------|------------------------|-----------------------
Hs.77858 | mesenchyme homeo box 2 (growth arrest-specific homeo box) | 2 | 0.50 | 5.06037E-05
Hs.77515 | inositol 1,4,5-triphosphate receptor, type 3 | 2 | 0.50 | 0.000504835
Hs.404 | myeloid/lymphoid or mixed-lineage leukemia (trithorax (Drosophila) homolog); translocated to, 3 | 2 | 0.50 | 0.000328254
Hs.151777 | eukaryotic translation initiation factor 2, subunit 1 (alpha, 35kD) | 4 | 0.25 | 0.000748487
Hs.271980 | mitogen-activated protein kinase 6 | 2 | 0.50 | 0.000633155
Hs.172210 | MUF1 protein | 6 | 0.20 | 0.000361767
Hs.1420 | fibroblast growth factor receptor 3 (achondroplasia, thanatophoric dwarfism) | 2 | 0.50 | 0.000383285
Hs.26014 | activin A receptor, type II | 3 | 0.33 | 1.02059E-05
Hs.349092 | ESTs, Weakly similar to 138022 hypothetical protein [H.sapiens] | 4 | 0.25 | 6.81344E-05
Hs.75447 | ralA binding protein 1 | 4 | 0.25 | 0.000316546
Hs.74294 | aldehyde dehydrogenase 7 family, member A1 | 2 | 0.50 | 0.000100641
Hs.82002 | endothelin receptor type B | 3 | 0.33 | 0.000569089
Hs.4082 | lectin, galactoside-binding, soluble, 8 (galectin 8) | 5 | 0.20 | 1.83147E-05
Hs.4187 | hypothetical protein 24636 | 5 | 0.20 | 4.789E-07
Hs.75564 | CD151 antigen | 3 | 0.33 | 0.000340471
Hs.77613 | ataxia telangiectasia and Rad3 related | 2 | 0.50 | 0.000390206
Hs.74101 | spleen tyrosine kinase | 2 | 0.50 | 0.000482157
Hs.89691 | UDP glycosyltransferase 2 family, polypeptide B4 | 2 | 0.50 | 0.001938969
Hs.48876 | farnesyl-diphosphate farnesyltransferase 1 | 2 | 0.50 | 0.00018038
Hs.98074 | itchy (mouse homolog) E3 ubiquitin protein ligase | 3 | 0.33 | 0.000154619

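
The two ratio columns and the t-test column of the table can be related with a short sketch. It assumes that "(equal)" denotes a two-sample Student t-test with pooled (equal) variance, which the label suggests but the report does not state. The function names and intensities are hypothetical, and only the t statistic is computed, since a p value would additionally require the t distribution.

```python
from math import sqrt
from statistics import mean, variance

def fold_ratios(responsive, resistant):
    """Return (responsive/resistant, resistant/responsive) ratios of group means."""
    ratio = mean(responsive) / mean(resistant)
    return ratio, 1.0 / ratio

def pooled_t_statistic(a, b):
    """Two-sample Student t statistic assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1.0 / na + 1.0 / nb))

# Hypothetical normalized intensities for one gene (not taken from the report).
responsive = [4.0, 4.2, 3.8]
resistant = [2.0, 2.1, 1.9]
ratios = fold_ratios(responsive, resistant)
t_stat = pooled_t_statistic(responsive, resistant)
```

The two ratios are reciprocals of each other, so a gene listed at 2 in one column appears at 0.50 in the other.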
These observations strongly implicate the 20 genes as being both differentially expressed
and discriminatory between antiestrogen sensitive and resistant cells. The data also suggest that
some of these genes may be functionally involved in conferring acquired antiestrogen resistance
in breast cancer cells. Further analysis of these genes and the entire data set continues as we apply
additional methods to mine the data.

In our last report we included our new normalization publication (IEEE Trans Inf Technol
Biomed, 6: 29-37, 2002). We now include a publication on our novel "block principal
components  analysis"  method  for  exploring  gene  expression  microarray  data.  A  reprint  is 
included  in  the  appendix. 

We  have  also  derived  a  new  method  for  multidimensional  scaling  called  discriminant 
components  analysis.  Rather  than  focusing  on  capturing  data  variance,  as  is  done  using  principal 
component  analysis,  this  new  method  maximizes  the  discriminating  components  in  the  data.  The 
manuscript has been accepted for publication in the informatics literature (Journal of Signal
Processing Systems) and a preprint is included in the appendix.
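
The difference between capturing variance and capturing discrimination can be shown with a two-feature toy example. The sketch below uses the classical two-class Fisher linear discriminant as a stand-in; it is not the discriminant components analysis method of the manuscript, and the data and function names are hypothetical.

```python
from math import atan2, cos, hypot, sin

def mean_point(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter(points, center):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix about `center`."""
    sxx = sum((x - center[0]) ** 2 for x, _ in points)
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in points)
    syy = sum((y - center[1]) ** 2 for _, y in points)
    return sxx, sxy, syy

def principal_direction(points):
    """Unit vector along the direction of maximum variance (first PC)."""
    sxx, sxy, syy = scatter(points, mean_point(points))
    theta = 0.5 * atan2(2.0 * sxy, sxx - syy)  # major-axis angle of a symmetric 2x2
    return cos(theta), sin(theta)

def fisher_direction(class_a, class_b):
    """Unit vector proportional to Sw^-1 (mean_a - mean_b)."""
    ma, mb = mean_point(class_a), mean_point(class_b)
    axx, axy, ayy = scatter(class_a, ma)
    bxx, bxy, byy = scatter(class_b, mb)
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy  # within-class scatter Sw
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = hypot(wx, wy)
    return wx / norm, wy / norm

# Feature 1 varies widely in both groups; only feature 2 separates them.
responsive = [(-2.0, 0.1), (0.0, -0.1), (2.0, 0.0)]
resistant = [(-2.0, 1.0), (0.0, 1.1), (2.0, 0.9)]
pc1 = principal_direction(responsive + resistant)
w = fisher_direction(responsive, resistant)
```

In this example the first principal component lies almost entirely along the high-variance first feature, which carries no class information, while the discriminant direction lies almost entirely along the second feature, which separates the two groups.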

TASK  3:  Begin  to  assess  the  likely  functional  relevance 
of  representative  members  of  these  clusters  and  study 
their  expression  in  human  breast  cancer  biopsies. 

To  maintain  focus  within  this  application,  we  have  limited 
our initial studies to NFkB and IRF-1. Our intention is to
obtain sufficient preliminary data to support an R01 or
DOD application focused on these two genes and their
interactions in antiestrogen resistance. We have continued
to study the role of our dominant negative interferon
regulatory factor-1 (dnIRF-1). We have now made
excellent  progress  on  this  aspect  of  the  study  and  have 
submitted  a  manuscript  showing  the  ability  of  dnIRF-1  to 
block  the  proapoptotic  effects  of  ICI  182,780  but  not  the 
cell  cycle  arrest  effects  of  the  antiestrogen.  Our  ability  to 
separate  these  two  components  of  sensitivity  to 
antiestrogens  has  several  important  implications.  For 
example,  selectively  increasing  the  proapoptotic  effects  of 
antiestrogens  may  be  an  effective  means  to  improve  their  ability  to  increase  overall  survival  in 
patients  because  this  should  increase  the  proportion  of  cells  undergoing  apoptotic  cell  death. 
Cells  that  are  only  growth  arrested  may  survive  and  thereby  have  more  opportunities  to  adapt, 
acquire  resistance,  and  generate  subsequent  disease  recurrence.  A  copy  of  the  manuscript  will  be 
included  with  the  final  report. 

We have also made good progress in our initial studies on NFkB. In collaboration with
Dr. Christine Pratt at the University of Manitoba, we now implicate NFkB in estrogen
independence. These data are consistent with the increased sensitivity of our antiestrogen
resistant cells to the natural inhibitor of NFkB (parthenolide). Our data with parthenolide were
included in the paper by Gu et al. that we included with last year's report (8). The study with Dr.
Pratt’s laboratory is included with this report (see appendix).


Figure: DCA projection of the 20-D data set showing linear separability of antiestrogen sensitive
and resistant profiles. ○ = LCC1 (sensitive; n=3); ▲ = LY2 (n=3; resistant); ▼ = LCC9
(n=3; resistant); ◄ = LCC2 (n=3; resistant); ► = RR (n=3; resistant). Total variance
covered by top 3 PC: 75%.
Key Research Accomplishments (bulleted)

•  Completed  and  published  manuscript  describing  data  implicating  NFkB  in  estrogen 
independence. 

•  Completed  microarray  data  analysis  of  cell  lines. 

•  Built  an  accurate  neural  predictor  of  antiestrogen  responsiveness  based  on  the  microarray 
data  collected  above. 

•  Completed  and  published  (in  press)  a  new  algorithm  for  microarray  data  visualization 
based  upon  maximizing  the  discriminant  data  among  experimental  groups. 

•  Completed and submitted our initial studies of the role of IRF-1 in ICI 182,780-mediated
cell signaling.


Reportable  Outcomes 

Reportable  outcomes  are  presented  as  manuscripts  and  abstracts. 


Manuscripts  and  Abstracts 

We  have  published  several  studies  directly  related  to  the  funded  work,  including  a  major  review 
in  the  journal  Oncogene. 


Manuscripts 

1. Pratt, M.A.C., Bishop, T.E., White, D., Yasvinski, G., Menard, M., Niu, M.Y. & Clarke, R.
“Estrogen withdrawal-induced NF-κB and Bcl-3 expression in breast cancer cells: roles in
growth and hormone independence.” Mol Cell Biol, 23: 6887-6900, 2003.

2. Wang, Z., Zhang, J., Lu, J., Lee, R., Kung, S.-Y., Clarke, R. & Wang, Y. “Discriminatory mining
of gene expression microarray data.” J Signal Process Systems, in press.

3. Clarke, R., Liu, M.C., Bouker, K.B., Gu, Z., Lee, R.Y., Zhu, Y., Skaar, T.C., Gomez, B.,
O'Brien, K., Wang, Y., Hilakivi-Clarke, L.A. “Antiestrogen resistance in breast cancer and the
role of estrogen receptor signaling.” Oncogene, 22: 7316-7339, 2003.

4. Welch, J.N. & Clarke, R. “ErbB-2 expression and drug resistance in cancer.” Signal, 3: 4-9,
2002. (review - this was presented as being in press in the prior report).

Reprints  of  papers  #1-3  and  a  preprint  of  #4  are  included  in  the  appendix. 


Abstracts 

1. Zwart, A., Lee, R.Y., Zhang, J., Wang, J., Wang, Y. & Clarke, R. “mRNA profiles from
MCF-7 variants are used to predict antiestrogen resistance/responsive phenotypes.” Proc 85th
Annual Meeting Endocrine Society 146, 2003.
Conclusions

We have made excellent progress in our studies on the molecular characterization of antiestrogen
resistance, as is evident in our productivity measured by publications and new preliminary
data. The study is on track and a considerable amount of data is accumulating. Several new
algorithms under development are showing good performance in our initial analyses. Our data
with NFkB, IRF-1, and the dnIRF-1 are encouraging and suggest we are on the right track to
identifying new signal transduction pathways associated with acquired antiestrogen resistance.
For example, these data show that resistant cells are more sensitive to inhibition of NFkB.
Overexpression of IRF-1, which is suppressed by estrogens and induced by antiestrogens, is
associated with reduced cell proliferation, and the dnIRF-1 data indicate that ICI 182,780 signals
to apoptosis primarily through IRF-1.
About one-third of breast cancers express a functional estrogen (β-estradiol [E2]) receptor (ER) and are
initially dependent on E2 for growth and survival but eventually progress to hormone independence. We show
here that ER+, E2-independent MCF-7/LCC1 cells derived from E2-dependent MCF-7 cells contain elevated
basal NF-κB activity and elevated expression of the transcriptional coactivator Bcl-3 compared with the
parental MCF-7 line. LCC1 NF-κB activity consists primarily of p50 dimers, although low levels of a p65/p50
complex are also present. The ER− breast cancer cell lines harbor abundant levels of both NF-κB complexes.
In contrast, nuclear extracts from MCF-7 cells contain a significantly lower level of p50 and p65 than do LCC1
cells. Estrogen withdrawal increases both NF-κB DNA binding activity and expression of Bcl-3 in MCF-7 and
LCC1 cells in vitro and in vivo. Tumors derived from MCF-7 cells ectopically expressing Bcl-3 remain E2
dependent but display a markedly higher tumor establishment and growth rate compared to controls.
Expression of a stable form of IκBα in LCC1 cells severely reduced nuclear expression of p65 and the p65/p50
DNA binding heterodimer. Whereas LCC1 tumors in nude mice were stable or grew, LCC1(IκBα) tumors regressed
after E2 withdrawal. Thus, both p50/Bcl-3- and p65/p50-associated NF-κB activities are activated early in
progression and serve differential roles in growth and hormone independence, respectively. We propose that E2
withdrawal may initiate selection for hormone independence in breast cancer cells by activation of NF-κB and
Bcl-3, which could then supplant E2 by providing both survival and growth signals.


About 60% of all diagnosed breast cancers express estrogen receptors (ERs), and about half of
these are dependent on estrogen for growth and are initially responsive to endocrine therapy
(15, 25, 48). These tumors eventually acquire resistance to hormonal manipulation as part of
their progression toward a more malignant phenotype, and in many instances they cease to express
ERs or express mutant forms of the ER (33, 34). The MCF-7 line is a widely used prototype for
estrogen-dependent breast cancer. These cells form tumors in nude mice in the presence of
circulating β-estradiol (E2), and the tumors regress rapidly through an apoptotic mechanism (21)
when the source of E2 is removed (29, 44). In order to study the progression of breast cancer
toward a hormone-independent phenotype, sublines derived from MCF-7 cells cultured in vivo and in
vitro in the presence of subphysiological concentrations of estrogen have been isolated (13, 14).
MCF-7/MIII cells were isolated from a small, slowly proliferating MCF-7 tumor that arose in an
ovariectomized athymic mouse, and a second passage produced MCF-7/LCC1 cells, which form
E2-independent tumors with a significantly reduced latency. Both cell lines retain the parental
MCF-7 level of expression of the ER but display increased expression of some estrogen-regulated
genes with a concomitant loss of E2 responsiveness in vitro. Although LCC1 cells can efficiently
generate tumors in nude mice in the absence of estrogen, they grow more rapidly when estrogen is
present and therefore retain a degree of estrogen responsiveness in vivo (9).

* Corresponding author. Mailing address: Department of Cellular and Molecular Medicine,
University of Ottawa, 451 Smyth Rd., Ottawa, Ontario, Canada K1H 8M5. Phone: (613) 562-5800,
ext. 8366. Fax: (613) 562-5434. E-mail: cpratt@uottawa.ca.

The transcription factor NF-κB is composed of a heterodimer of members of the Rel family of
transcription factors, including p50 (NF-κB1), p65 (RelA), c-Rel, RelB, and p52 (NF-κB2).
Transactivation domains are absent in p50 and p52, and thus they are active only as heterodimers
with other members. This family of proteins contains Rel homology domains which mediate DNA
binding, dimerization, and nuclear localization. Activation of NF-κB occurs following a wide
variety of stimuli, including exposure to some cytokines and several kinds of stress. Inactive
NF-κB is maintained in the cytoplasm as a result of interaction with an inhibitory subunit, IκB
(4), of which there are four subtypes, α, β, γ, and ε (31). NF-κB activation follows
phosphorylation of IκB by IκB kinases (α or β), which in turn are activated by an NF-κB-inducing
kinase called NIK (24, 38). IκB phosphorylation results in its degradation and subsequent
release, allowing NF-κB translocation to the nucleus, where it regulates a large number of genes
involved in inflammation, immunity, cell adhesion, and apoptosis-regulatory molecules (2).
Another member of the IκB family is the oncoprotein Bcl-3, which can disrupt the association
between transcriptionally inactive p50 and p52 homodimers, allowing association of a
transactivating partner. Bcl-3 can also directly activate transcriptional function in these
complexes (reference 31 and references therein). Much of the information regarding the role of
NF-κB in cell survival has come from the study of tumor necrosis factor alpha signaling in tumor
cells. While the tumor necrosis factor alpha receptor activates a caspase cascade leading to
apoptosis, in most cells a concomitant activation of NF-κB prevents cell death (3, 5). These
observations led to the discovery that NF-κB regulates the activity of several survival genes,
including genes for Bcl-x and inhibitors of apoptosis (IAPs) (reviewed in reference 42).

There is abundant evidence that NF-κB can promote tumorigenesis (32, 46). One of the earliest
reports showed that antisense downregulation of the p65 subunit of NF-κB in fibrosarcoma cells
could both inhibit tumorigenicity and cause tumor regression (22). More recently, inhibition of
NF-κB activity by stable expression of a dominant negative inhibitory IκB kinase in mouse mammary
tumor cells reduced their tumorigenic potential (7). Resistance to chemotherapeutic drugs is also
impaired by NF-κB inhibition in tumor cells (32, 54). Studies have shown that breast cancer cell
lines expressing the ER contain low levels of NF-κB DNA binding activity, while ER− breast cancer
cells display constitutively high levels of NF-κB DNA binding and correspondingly high NF-κB
transactivational activity (36, 50). NF-κB activity is also induced in rat mammary glands after
treatment with carcinogens and appears to increase prior to malignant transformation of mammary
epithelial cells (28).

In this study we have used the ER-positive, hormone-independent LCC1 and MCF-7 parental cells to
determine if and at which stage during serial breast cancer cell progression toward hormone
independence these cells begin to acquire elevated NF-κB activity. Using a tumor xenograft model,
we show that (i) expression of the Bcl-3 protein in MCF-7 cells augments tumor establishment and
growth but is insufficient to confer E2-independence and (ii) inhibition of p65-associated NF-κB
activity with a dominant form of the NF-κB inhibitor IκBα reverts the E2-independent phenotype of
LCC1 cells.

MATERIALS  AND  METHODS 

Plasmids and antibodies. Anti-p65 (A), anti-p50 (H-119), anti-p52 (K-27),
anti-c-Rel (N), anti-Bax (N-20), and anti-Bcl-3 were obtained from Santa Cruz
Biotechnology. Anti-Bcl-2 and anti-Bcl-xL were gifts from John Reed, La Jolla,
Calif. Anti-FLAG M2 monoclonal antibody and anti-α-actin polyclonal antibody
were purchased from Sigma. The 3X-NF-κB3-luciferase construct containing
three copies of the NF-κB response element from the major histocompatibility
complex class I gene and a mutated version of this element (19),
pRC/CMV-FLAG-tagged IκBα S32A/S36A, the superrepressor form of IκBα
(27), and CMV-2-FLAG-Bcl-3 were provided by A. S. Baldwin. Initial studies
utilized the major histocompatibility complex reporter gene, and the κ
light-chain reporter gene was obtained later since it produced higher overall enzyme
values after transfection.

Cell culture and transfection. MCF-7 cells were derived from several
isolates. MCF-7(early) and MCF-7(late) cells are uncloned isolates of
MCF-7 at controlled passages of 46 to 48 and 157 to 159, respectively.
MCF-7/MIII and MCF-7/LCC1 are ER-positive, E2-independent cell lines.
MIII cells were isolated from a slow-growing tumor resulting from
inoculation of parental MCF-7 cells into an ovariectomized nude mouse.
MIII cells were further passaged in ovariectomized nude mice and then
reestablished in vitro as the continuous line MCF-7/LCC1, which forms
tumors in ovariectomized mice with reduced latency (9). Both sublines
were passaged fewer than 30 times after isolation. Other MCF-7 cells, not
designated early or late passage, were obtained originally from
L. Murphy (Winnipeg, Canada) and are of undetermined passage. MDA-MB-231
(ER-negative), MDA-MB-468 (ER-negative), and T47-D (ER-positive) cells
were obtained from the American Type Culture Collection (Manassas, Va.).
The tumorigenic characteristics of many of these breast cancer cell lines
have been documented (45, 49). MCF-7(40F) is an MCF-7 derivative selected
for resistance to adriamycin that is E2 independent and ER-negative (20).
All MCF-7-derived cells were maintained in Dulbecco's modified Eagle's
medium (DMEM) (GIBCO-BRL) containing a high glucose concentration, 5%
fetal bovine serum (GIBCO-BRL), and 2 µg of gentamicin sulfate per ml.
T47-D, MDA-MB-468, and MDA-MB-231 cells were maintained in DMEM
containing 5% serum and a low glucose concentration. SKBR-3 cells were
maintained in McCoy's 5A medium with 10% fetal bovine serum. Incubation
was at 37°C in a 5% CO2 humidified environment. In experiments requiring
E2 depletion, cells were precultured for 7 days with several changes of
phenol red-free DMEM containing 5% steroid-free fetal bovine serum that
had been adsorbed to dextran-coated charcoal for 45 min at 45°C. E2 was
added from a 1 mM stock in ethanol to a final concentration of 10^-8 M
for the indicated times. Transfections were performed with Lipofectamine
according to the directions of the manufacturer (GIBCO-BRL). For stable
transfection, pcDNA3-IκBαSR-FLAG or CMV-FLAG-Bcl-3 was introduced into
MCF-7/LCC1 cells, and clones were selected in medium containing 50 µg of
G418 per ml as previously described (52). Resistant clones were picked
and expanded, and then lysates were subjected to immunoblot analysis with
the anti-FLAG antibody. For transient cotransfections,
expression-reporter constructs and pcDNA3-LacZ were introduced by using
Superfect or Lipofectamine according to the manufacturer's directions.
Cell extracts were harvested 48 h later and analyzed in a BioOrbit 1250
luminometer by using luciferase assay reagent (Promega). Reported values
represent means ± standard errors (SE) from duplicate or triplicate
experiments, normalized to LacZ activity determined by
methylumbelliferyl-β-glucuronide assay (1), and are representative of
those from at least three separate experiments.
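The normalization described above (luciferase activity divided by LacZ activity from the same extract, then summarized as a mean ± SE) can be sketched as follows. This is a minimal illustration only: the function names and the readings are hypothetical, and the SE is assumed here to be the sample standard deviation divided by the square root of the number of replicates.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_reporter(luciferase, lacz):
    """Divide each luciferase reading by the LacZ reading of the same extract."""
    if len(luciferase) != len(lacz):
        raise ValueError("each luciferase reading needs a matching LacZ reading")
    return [luc / z for luc, z in zip(luciferase, lacz)]


def mean_and_se(values):
    """Return (mean, standard error); SE = sample SD / sqrt(n), an assumption."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are required for an SE")
    return mean(values), stdev(values) / sqrt(n)


# Hypothetical triplicate readings in arbitrary units (not data from this study).
luc = [1200.0, 1500.0, 1800.0]
lacz = [100.0, 100.0, 100.0]

ratios = normalized_reporter(luc, lacz)
m, se = mean_and_se(ratios)
print(f"normalized activity: {m:.2f} +/- {se:.2f} (n={len(ratios)})")
```

Dividing by the cotransfected LacZ signal corrects each replicate for differences in transfection efficiency before the replicates are averaged.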

EMSA analysis. Electrophoretic mobility shift assays (EMSA) were
performed with nuclear extracts from cultured cells or from tumors
isolated as described by Osborn et al. (41). NF-κB site oligonucleotides
were obtained from Promega (E3291) and end labeled with T4 polynucleotide
kinase by using [γ-32P]ATP (Amersham). Five micrograms of nuclear extract
was mixed with 5 µl of DNA binding buffer (20 mM HEPES [pH 7.9], 0.2 mM
EDTA, 0.2 mM EGTA, and 2 mM dithiothreitol in 50% glycerol), 5 µg of
poly(dI-dC), and 0.2 ng of labeled probe in a final volume of 20 µl and
then incubated at room temperature for 25 min. Specific bands were
verified by adding a 10- to 125-fold molar excess of cold oligonucleotide
10 min prior to addition of the labeled probe, and equivalence of extract
loading was demonstrated by EMSA with a DNA fragment containing the
consensus Sp1 binding site (Promega). Samples were loaded on a 5% native
polyacrylamide gel and run in nondenaturing Tris-glycine buffer. For
supershift experiments, 2 µg of each antibody was added to extracts and
left for 1 h prior to addition of the labeled probe.

Tumors in nude mice. Six-week-old ovariectomized nude mice (nu/nu CD-1)
were implanted subcutaneously with an estrogen release pellet
(60-day-release pellet containing 0.72 mg of E2; Innovative Research of
America, Sarasota, Fla.). Two days later, 2 × 10*^ cells derived from
exponential cultures of wild-type MCF-7 cells were injected
subcutaneously into the flank of the animal. For LCC1 experiments, three
pooled clones of either LCC1(IκBαSR) or LCC1(pcDNA3) were injected
subcutaneously into the contralateral flanks of 12 animals. Similarly,
cell suspensions containing three pooled MCF-7(FLAG) or
MCF-7(FLAG-Bcl-3) clones were injected subcutaneously into the
contralateral flanks of 12 mice, while another 3 mice received
MCF-7(FLAG-Bcl-3) in one flank only. Tumors were allowed to form over a
period of 4 to 6 weeks, and volumes were determined by caliper
measurements as previously described (44). The E2 release pellet was then
removed, and regression was monitored until the tumor reached 50% of its
volume at pellet removal or over a time course as indicated, at which
point the animal was sacrificed and the tumor was removed. Tumor protein
lysates were prepared by snap freezing followed by pulverization under
liquid N2. After the addition of radioimmunoprecipitation assay (RIPA)
buffer (50 mM Tris [pH 8.0], 150 mM NaCl, 0.1% sodium dodecyl sulfate
[SDS], 0.5% sodium deoxycholate, 1% NP-40, 10 µg of
phenylmethylsulfonyl fluoride per ml, 1 µg of aprotinin per ml, and
0.02% sodium azide), samples were sonicated and then incubated for 30 min
on ice before centrifugation at 16,000 × g to remove insoluble material.
Protein was measured with Bio-Rad reagent.
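The regression endpoint described above (the time at which a tumor falls to 50% of its volume at pellet removal) can be illustrated with a short sketch. Two assumptions are made here that the text does not state: tumor volume is approximated from caliper readings with the commonly used ellipsoid formula (length × width^2)/2, since the actual method is given only by reference (44), and the day of 50% regression is estimated by linear interpolation between measurements. All names and numbers are hypothetical.

```python
def ellipsoid_volume(length_mm, width_mm):
    """Approximate tumor volume in mm^3 as (length x width^2) / 2 (assumed formula)."""
    return length_mm * width_mm ** 2 / 2.0


def days_to_fraction(days, volumes, fraction=0.5):
    """Day (linearly interpolated) at which volume first falls to `fraction`
    of the volume at the first time point; None if that level is never reached."""
    target = volumes[0] * fraction
    points = list(zip(days, volumes))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if v0 > target >= v1:
            return d0 + (v0 - target) * (d1 - d0) / (v0 - v1)
    return None


# Hypothetical measurements; day 0 is the day of E2 pellet removal.
days = [0, 7, 14, 21]
volumes = [200.0, 160.0, 100.0, 60.0]

example_volume = ellipsoid_volume(10.0, 6.0)
half_time = days_to_fraction(days, volumes)
print(example_volume, half_time)
```

Interpolating between measurement days gives a fractional estimate of the regression time, which is one way that values such as half-days could arise when tumors are measured at intervals.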

Immunoblot analysis. Cell monolayers were washed twice with
phosphate-buffered saline and lysed in 400 µl of RIPA buffer per 10"^
cells for 30 min on ice. Insoluble material was removed following
centrifugation at 12,000 × g for 15 min, and soluble protein
concentrations were determined with a Bio-Rad kit. Proteins (20 µg) were
separated on SDS-7.5 or 10% polyacrylamide gels and transferred to
polyvinylidene difluoride membranes. After exposure to primary antibody,
proteins were detected with peroxidase-conjugated second antibody (Sigma)
and chemiluminescent substrate (Dupont, NEN).

Immunocytochemistry and ISEL. Seven-micrometer frozen sections were cut
from LCC1(pcDNA3) and LCC1(IκBαSR) tumors. For Bax immunostaining,
sections were fixed in formaldehyde for 30 min and incubated with
polyclonal Bax NH2 terminus antibody followed by CY3-conjugated goat
anti-rabbit immunoglobulin G (Jackson Laboratories). In situ end labeling
(ISEL) was performed with terminal transferase and biotin-16-dUTP
(Boehringer Mannheim) followed by CY2-labeled streptavidin (Amersham) as
described previously (44). Sections were visualized and imaged with a
Zeiss Axiophot fluorescence microscope equipped with Northern Eclipse
software (EMPIX Imaging Inc., Mississauga, Ontario, Canada).

TABLE 1. Cell lines

Cell line ..... Description(a)

MCF-7 (early or late) ..... Well differentiated, ER positive, p53 wild
    type (39), controlled passages (early, 46 to 48 passages; late, 156
    to 159 passages)

T47-D ..... Carcinoma, differentiated epithelial, ER positive, p53 mutant
    (39)

MIII ..... MCF-7 (controlled passage) inoculated into ovariectomized nude
    mice; slow-growing tumor isolated 6 months after inoculation and
    reestablished in vitro; ER positive and E2 responsive in vivo (14)

LCC1 ..... Isolated from rapidly growing tumors derived from second
    inoculation of MIII cells into ovariectomized nude mice and then
    reestablished in vitro; ER positive and E2 responsive in vivo;
    increased constitutive expression of some E2-regulated genes (9)

MDA-MB-231 ..... Poorly differentiated, ER negative, p53 mutant (39)

MCF-7(40F) ..... MCF-7 cells selected for 40-fold resistance to
    doxorubicin (adriamycin) [wild-type MCF-7 ED50, 14.5 nM; MCF-7(40F)
    ED50, 474 nM (20)], p53*"*^‘ (39), ER negative

(a) ED50, 50% effective dose.

RESULTS 

LCC1 cells display elevated NF-κB DNA binding activity. Evidence shows
that ER expression in breast cancer cell lines is associated with low
baseline NF-κB activity, while breast cancer cell lines devoid of ER have
high constitutive levels of NF-κB activity (36). Although LCC1 cells are
E2 independent, they express functional ER as determined by their
increased growth rate in the presence of E2 (14). Table 1 presents a
summary of the cell lines and isolates used in this study. In order to
determine whether NF-κB activity correlates with E2 dependence, nuclear
extracts from ER-positive T47-D and ER-positive LCC1 cells were assayed
for NF-κB DNA binding, and the NF-κB activity was compared with that in
MCF-7 cells selected for resistance to adriamycin [MCF-7(40F) cells].
These cells have lost expression of the ER and are able to form tumors
efficiently in ovariectomized nude mice (20). Figure 1A shows the results
of an EMSA which demonstrates that, as predicted from the literature on
NF-κB levels in ER-negative breast cancer cell lines, MCF-7(40F) cells
also contain high levels of constitutive NF-κB DNA binding activity
associated with fast- and slow-migrating complexes. Conversely, T47-D
cells contained very low levels of NF-κB DNA binding activity, as
previously reported (36). In contrast, LCC1 cells displayed intermediate
levels of NF-κB activity associated primarily with the faster-migrating
complex. In order to directly compare NF-κB DNA binding activities of
cells derived from the same isolate of MCF-7 cells, we prepared nuclear
extracts from the early passage of parental MCF-7 cells, MIII cells, and
LCC1 cells and contrasted these NF-κB DNA binding complexes with those
from ER-negative MDA-MB-231 cells. The results in Fig. 1B show again that
LCC1 cells had high levels of NF-κB activity compared with the other
MCF-7-derived cells. Similar to MCF-7(40F) cells, MDA-MB-231 cells
contain a high level of constitutive NF-κB activity associated with two
different complexes. The EMSA in Fig. 1C, in which an Sp1 binding site
DNA fragment was used as a probe, contained equivalent amounts of the
indicated nuclear extracts used for Fig. 1A and B and demonstrates that
quantitative differences were not a function of extract loading. To
determine the composition of the NF-κB complexes in LCC1 cells relative
to ER-negative cells, we supershifted nuclear extracts from LCC1 and the
ER-negative MCF-7(40F) and MDA-MB-231 cells with antibodies against NF-κB
proteins. Figure 1D shows clearly that while p50 was present in complexes
from all three cell lines, the ER-negative lines contained markedly
higher levels of the slower-migrating p65/RelA complex. The p50 antibody
also supershifted the p65-associated complex, thus indicating that the
upper complex is a heterodimer of p50 and p65. Thus, NF-κB subunits
appear to be differentially activated in breast cancer cell lines.
Despite the commercial designation of the p52 antibody for supershift
analysis, we were unable to supershift any of the complexes with this
antibody or with an antibody obtained from the laboratory of
A. S. Baldwin which reportedly was capable of supershifting p52 but only
on an erratic basis, and therefore we cannot formally rule out that the
lower-migrating complex also contains heterodimers of p52/p50. Note that
all of the cell lines were cultured with identical serum concentrations,
thereby eliminating the influence of differential contributions of serum
growth factors on NF-κB activity.

Not all DNA-bound NF-κB is transcriptionally active (56); therefore,
NF-κB transactivational activity was tested by transient transfection of
an NF-κB reporter gene. The graph in Fig. 1E shows that the basal level
of NF-κB-luciferase reporter gene activity in LCC1 cells was about
fivefold higher than that in the E2-dependent MCF-7(early) cells. Since
our supershift analysis revealed major differences between the
predominant NF-κB DNA binding complexes in ER-positive cells and
ER-negative cells, we performed immunoblot analysis on nuclear extracts
from these cell lines to ascertain the relative levels of these proteins.
The results in Fig. 1F indicate that MCF-7(early) nuclear extracts
contain only trace levels of p65, while LCC1 nuclei contain a slightly
higher level. In contrast, the ER-negative, E2-independent MCF-7(40F) and
MDA-MB-231 cells harbor much higher nuclear levels of this protein. When
the same extracts were assayed for p50 immunoreactivity, significantly
less p50 was present in MCF-7(early) cells than in the other cells. The
LCC1 and the ER-negative cell lines contained similar nuclear levels of
p50. Given that p50 heterodimerizes with p65 in these cells, the nuclear
level of p50 would affect both the fast- and slower-migrating NF-κB
complexes. The relative nuclear content of p52 detected by this antibody
was low and was essentially equivalent across all cell lines. Thus,
nuclear NF-κB proteins and NF-κB DNA binding complexes are both
qualitatively and quantitatively different in breast cancer cell lines
and isolates grown in vitro under the same culture conditions.
Progression appears to correlate with increased levels of NF-κB
consisting predominantly of p50 dimers as well as with a discernible
increase in the formation of p65/p50 DNA binding complexes.

FIG. 1. NF-κB DNA binding activity increases with acquisition of E2
independence. (A and B) NF-κB complexes in 5 µg of nuclear extract from
T47-D, LCC1, and MCF-7(40F) cells (A) and from MCF-7(early passage),
MIII, LCC1, and MDA-MB-231 cells (B) were subjected to EMSA as described
in Materials and Methods, using an oligonucleotide containing a consensus
NF-κB binding site. Arrows indicate fast- and slow-migrating complexes. A
10-fold excess of cold oligonucleotide (comp) was used for competition of
specific bands in duplicate MCF-7(40F) and MDA-MB-231 lanes. (C) Five
micrograms of the same nuclear extracts used for panels A and B was
subjected to EMSA with the consensus Sp1 enhancer element as a probe.
(D) Antibody supershift analysis of NF-κB complexes. NF-κB complexes from
the indicated cell lines were incubated with antibodies against NF-κB
proteins prior to DNA binding as described in Materials and Methods.
Arrows indicate fast- and slow-migrating complexes; arrowheads identify
antibody-shifted complexes. NS, normal serum; comp, a 10-fold excess of
unlabeled probe was used to compete with specific bands. (E) Results from
transient transfection of MCF-7(early) and LCC1 cells with the
3X-NF-κB3-luciferase and mutant-luciferase reporter constructs. The
results represent values from triplicate experiments ± SE and were
normalized to β-galactosidase activity from a cotransfected pCMV-LacZ
plasmid. (F) Immunoblot analysis of nuclear p65, p50, and p52 in breast
cancer cell lines. Ten micrograms of each nuclear extract was reacted
with either anti-p65 antibody or an anti-p50 antibody which also
cross-reacts with p52.

E2 regulates NF-κB DNA binding activity in vivo and in vitro.
Hormone-dependent breast tumors undergo regression after E2 removal by
ovariectomy or following antiestrogen therapy as the result of programmed
cell death of a large majority of the cells (29, 44). A subset of these
cells will escape apoptosis as a result of acquisition of hormone
independence and/or antiestrogen resistance. Since steroid hormones are
known to modulate the levels of NF-κB activity, we tested the effects of
growth in E2-free medium on the NF-κB DNA binding activity of MCF-7 and
LCC1 cells. Figure 2A shows that NF-κB activity began to increase in
MCF-7 cells within 4 days after the medium was replaced with E2-free
medium and continued to rise over the 12-day period studied. Thus, NF-κB
activity in ER-positive MCF-7 breast cancer cells is highly responsive to
removal of E2 in vitro. As expected, LCC1 cells have a high constitutive
level of NF-κB DNA binding activity, which underwent a slight increase
within 12 days following culture in E2-free medium. In order to determine
whether this increased level of NF-κB activity also occurs in MCF-7
tumors in vivo, we grew MCF-7 tumors in nude mice implanted with E2
release pellets and isolated nuclear extracts from a solid tumor every
day for 7 days and again at 18 days after pellet removal. The EMSA in
Fig. 2B shows that, compared with that of an actively growing tumor in an
animal implanted with an E2 release pellet, NF-κB activity began to
increase within 3 days following E2 pellet removal and rose continuously
to a maximum within the 7-day period. As the tumors regressed, MCF-7
NF-κB DNA binding activity remained elevated in nuclear extracts for 18
days after E2 pellet removal. To profile NF-κB complexes in vivo from
E2-depleted MCF-7 xenografts, we similarly assayed DNA binding in nuclear
extracts from regressing MCF-7 cell tumors. Supershift analysis of
nuclear extracts from the MCF-7 tumor excised at day 6 following E2
release pellet removal (Fig. 2C) shows again that although the NF-κB
activity was primarily p50/p50 (or possibly p50/p52), an anti-p65
reactive complex was also present. Thus, both NF-κB complexes are also
detectable in regressing tumors and could potentially participate in the
evolution of the E2-independent phenotype. Taken together, these results
show that E2 removal both in vivo and in vitro is a potent stimulus of
NF-κB activity in breast cancer cells, which might then contribute to E2
independence. One interpretation of this result is that E2 removal
results in survival of cells with high endogenous levels of NF-κB rather
than an induction of NF-κB activity. In order to address this issue, we
tested the reversibility of the effect of E2 withdrawal on NF-κB DNA
binding. MCF-7 cells were grown in E2-free medium for 10 days and then
treated with vehicle or E2 for 72 h. The results in Fig. 2D demonstrate
that, as expected, NF-κB activity was high in MCF-7 cells cultured
without E2. In contrast, levels were strongly reduced in cells which had
been reexposed to E2, thus indicating regulation of NF-κB binding rather
than an altered composition of the cell population.

FIG. 2. E2 regulates NF-κB DNA binding in vitro and in vivo. (A) Nuclear
extracts (5 µg) from MCF-7 and MCF-7/LCC1 cells cultured in DMEM (lanes
C) or in E2-free medium for the indicated number of days were subjected
to EMSA with the NF-κB oligonucleotide. A 10-fold excess of cold
competing oligonucleotide (comp) preferentially competes the specific
NF-κB complexes. n.s., nonspecific band. The lower panel shows a gel
shift of the Sp1 consensus oligonucleotide with the same extracts to
control for extract loading. (B) MCF-7 tumors were grown in
ovariectomized nude mice implanted with an E2 release pellet. Tumors were
obtained prior to E2 pellet removal (control) and at the indicated times
following pellet removal. Nuclear extracts were incubated with a labeled
NF-κB oligonucleotide and subjected to EMSA as described in Materials and
Methods. A 50-fold excess of cold oligonucleotide (comp) was used to
compete specific binding. In the lower panel the same extracts were
assayed for binding to the consensus Sp1 element. (C) Tumor extracts from
day 6 after the E2 pellet removal described above were used for
supershift analysis with antibodies (Ab) against p50 and p65. Arrowheads
indicate supershifted complexes. Ns, normal serum. (D) MCF-7 cells
cultured in stripped medium without E2 for 10 days were treated with E2
or vehicle (-) for 3 days. Nuclear extracts were isolated, and 5 µg was
subjected to EMSA with the NF-κB site. Protein binding to the Sp1 element
from the same extracts is shown in the panel on the right.

Expression of ER proteins in MCF-7 sublines. Previous work by Nakshatri
et al. (36) showed that loss of ER expression correlated with the
acquisition of constitutive NF-κB activity. As LCC1 cells are E2
independent, it was possible that these cells had in fact begun to
downregulate ER expression, thus resulting in higher levels of NF-κB
activity. We (this work) and others (11, 17) have reported that E2 can
modulate NF-κB activity; thus, alterations of ER levels in the presence
or absence of E2 might indirectly affect NF-κB activity. In order to
determine the relative expression levels of both of the ERs and to see
whether E2 alters these levels, we performed immunoblot analysis of LCC1
cells, MIII cells, MCF-7(early) cells, and MCF-7 cells of undetermined
passage treated with E2 or vehicle for 72 h. Figure 3 shows that, as
expected, all cell lines express the 68-kDa ERα. Interestingly, LCC1
cells express a slightly higher level of ERα than the other cells.
Similarly, immunoblotting for ERβ indicated that all of the cell lines
expressed a 66-kDa protein corresponding to ERβ. Again, levels appeared
elevated in LCC1 cells. There was no indication that either of the ERs
was subject to E2 regulation in MIII or MCF-7 cells; however, E2
reproducibly increased both ERα and ERβ of MCF-7(early) cells and the
level of ERβ in LCC1 cells above levels in vehicle-treated cultures.
Thus, despite this marked increase in constitutive NF-κB activity in LCC1
cells, they retain high-level expression of ERα and ERβ.

FIG. 3. Immunoblot analysis of ER proteins. Twenty micrograms of
whole-cell protein extract from MCF-7 and derived sublines treated with
E2 or vehicle for 72 h was separated by SDS-polyacrylamide gel
electrophoresis and immunoblotted with antibodies against ERα and ERβ.
The expression of actin was used as an internal loading control. Numbers
on the right indicate molecular masses in kilodaltons.

Estrogen regulates Bcl-3 protein expression. The supershift experiments
described above indicated that the NF-κB complex in LCC1 cells is
composed primarily of p50, which in itself is transcriptionally inactive;
however, activation of the transcriptional function of this complex has
been shown to occur through complex formation with Bcl-3 (31). If p50
complexes are to be active in tumors following E2 removal, the accessory
factor Bcl-3 might contribute to this activity. To investigate this, we
subjected protein extracts from MCF-7 tumors to immunoblot analysis at
intervals after removal of the E2 release pellet. The results in Fig. 4A
show that control tumors contained low levels of Bcl-3, while the levels
rose significantly within 1 day after E2 removal and remained elevated
for the balance of the study. To assess whether Bcl-3 expression induced
by E2 removal was principally due to the absence of E2 or was secondary
to other effects on the tumor milieu, we analyzed extracts from LCC1
cells cultured in E2-free or E2-supplemented medium over several days.
The results in Fig. 4B show that culture of LCC1 cells in the absence of
E2 increased the expression of Bcl-3 within 1 day compared with culture
in E2-supplemented medium. The differential expression was already
maximal by 2 days and remained so throughout the 4-day study period.
Since NF-κB DNA binding is constitutively higher in E2-independent LCC1
cells and MDA-MB-231 cells than in MCF-7 cells and MIII cells in the
early stage of E2 independence, we wished to determine whether basal
levels of Bcl-3 might follow the same pattern. Figure 4C shows a Western
blot analysis of Bcl-3 levels in extracts from early- and late-passage
MCF-7 cells, MIII cells, LCC1 cells, and several ER-negative cells. The
results indicate that levels of Bcl-3 are highest in the E2-independent
lines, including LCC1 cells. A second comparison of Bcl-3 levels in
MCF-7(late), MCF-7(40F), and MDA-MB-468 cells (Fig. 4D) shows that,
unexpectedly, MCF-7(40F) cells express Bcl-3 at approximately the same
level as MCF-7 cells. However, it is important to note that MCF-7(40F)
cells were selected not for E2 independence but rather for adriamycin
resistance and therefore do not necessarily represent a natural stage in
hormone progression. Moreover, we cannot rule out that regulation of
Bcl-3 expression in other ER-negative cell lines is accomplished by the
same mechanism as in ER-positive cell lines.

FIG. 4. Bcl-3 is regulated by E2 in breast cancer cells in vivo and in
vitro. (A) Bcl-3 expression in MCF-7 breast cancer tumors was determined
by immunoblotting of 15 µg of tumor protein extracts on the indicated
days after E2 release pellet removal. The molecular mass in kilodaltons
is shown on the right. (B) Basal expression of Bcl-3 in E2-dependent and
-independent breast cancer cells. Whole-cell protein extracts were
isolated from the indicated breast cancer cell lines grown in appropriate
phenol red-containing medium. In all experiments, 15 µg of each extract
was subjected to immunoblot analysis with the Bcl-3 antibody and either
actin or GAPDH (glyceraldehyde-3-phosphate dehydrogenase) was used as a
protein loading control. (C) Immunoblot to detect Bcl-3 expression in
vitro in the presence and absence of E2. LCC1 cells were cultured in DMEM
or E2-free medium supplemented with E2 or vehicle as described in
Materials and Methods for the indicated times. Lane c, control consisting
of extracts from cells cultured in unstripped medium containing phenol
red. (D) Comparison of Bcl-3 expression in the indicated cell lines as
described for panel C.

Bcl-3 augments MCF-7 tumor growth but not E2 independence. Bcl-3 has recently been shown to
stimulate growth as well as provide a survival function in some cells (35, 47, 55). The results
described above clearly show that p50/p50 (and possibly p50/p52) complexes predominate in LCC1
cells, and the increased expression of Bcl-3 suggests that this protein could play a role in E2
independence. To assess the sufficiency of Bcl-3 expression in conferring E2 independence, we
stably transfected MCF-7(early) cells with a FLAG-tagged Bcl-3 expression construct. Figure 5A
shows an immunoblot of lysates from three pooled MCF-7(FLAG-Bcl-3) clones and three pooled
MCF-7(FLAG) control clones reacted with anti-FLAG to detect transfected protein (upper panel)
and anti-Bcl-3 (second panel) in order to compare the relative levels of Bcl-3 in these clones.
Bcl-3 expression has been associated with induction of the cyclin D1 gene, and immunoblot
analysis of the same nuclear extracts with an anti-cyclin D1 antibody showed that the
constitutive level of cyclin D1 expression was increased in the transfected cells (third panel).
Nuclear extracts from these pooled clones were also used for an EMSA to detect NF-κB DNA binding
activity. Figure 5B shows that, consistent with its ability to interact with p50, the p50
complex was augmented in MCF-7(FLAG-Bcl-3) clones compared with controls when equal amounts of
nuclear extract were assayed, suggesting a possible stabilization of the p50 DNA binding
complex. We then used both MCF-7(FLAG-Bcl-3) and MCF-7(FLAG) clones to generate tumors in
ovariectomized nude mice implanted with an E2 release pellet as described in Materials and
Methods. The growth of control tumors [MCF-7(FLAG)] and MCF-7(FLAG-Bcl-3) tumors was monitored
over 36 days, and the results are shown in Fig. 5C. Eleven of the 15 sites injected with
MCF-7(FLAG-Bcl-3) cells produced tumors. On the other hand, only 4 of 12 control sites formed
tumors over the same time period. Of the tumors that formed, the mean volumes (± SE) were
258 ± 45 mm³ for MCF-7(FLAG-Bcl-3) tumors and 67 ± 24 mm³ for MCF-7(FLAG) tumors. Tumor
regression was monitored after E2 release pellet removal until the tumor was 50% of its original
size at pellet removal. If Bcl-3 conferred E2 independence, we expected that MCF-7(Bcl-3) tumors
would either remain stable or continue to grow. However, comparison with control MCF-7(FLAG)
tumors showed virtually identical regression rates, requiring 14 days for the latter tumors and
15.5 days for MCF-7(FLAG-Bcl-3) tumors to regress to 50% of the original tumor volume. Thus,
Bcl-3 alone does not render MCF-7 tumors stable after E2 withdrawal, suggesting that
Bcl-3-mediated increases in p50 and/or p52 activity are either of insufficient magnitude or
cannot confer an E2-independent phenotype. In contrast, the higher rate of tumor establishment
and rapid tumor growth of MCF-7(FLAG-Bcl-3) cells compared with control cells shows that Bcl-3
can augment the growth and tumorigenicity of breast cancer cells.
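
The tumor take rates reported above (11 of 15 sites versus 4 of 12 sites) can be compared with a
two-sided Fisher exact test. The sketch below is illustrative only: the text does not report such
a test, the function name is hypothetical, and the test treats injection sites as independent,
which is an assumption (mice 1 to 12 carried both cell types on opposite flanks).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups, columns are (tumor formed, no tumor). Hypothetical helper,
    not part of the original study.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that k of the col1 tumors fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts taken from the text: 11 of 15 FLAG-Bcl-3 sites and 4 of 12 control sites.
p_value = fisher_exact_two_sided(11, 15 - 11, 4, 12 - 4)
print(f"two-sided Fisher exact p = {p_value:.4f}")
```

Because the flank pairing is ignored, the resulting p-value should be read as a rough check rather
than as a result of the study.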

Induction of both p65- and p50-associated activity after E2 withdrawal in vitro and in vivo. The
relative levels of p65 and p50 in nuclei from MCF-7(early) and LCC1 cells and the results in
Fig. 1B suggested that there were both qualitative and quantitative differences between NF-κB
complexes in the

FIG. 5. Bcl-3 promotes E2-dependent MCF-7 tumor growth. (A) MCF-7 cells were transfected with
either an empty FLAG vector or FLAG-Bcl-3. Shown is immunoblot analysis of cell extracts from
three pooled clones of each transfectant with an anti-FLAG antibody. The lower panel shows a
duplicate blot following incubation with anti-Bcl-3 to detect both endogenous and transfected
protein. Sizes of molecular mass markers are shown on the left in kilodaltons. (B) EMSA to assess
the effects of Bcl-3 overexpression on NF-κB DNA binding. Five micrograms of nuclear extract from
pooled MCF-7(FLAG) and MCF-7(FLAG-Bcl-3) clones was used for gel shift analysis with either the
NF-κB or Sp1 sequence as a probe. Samples were assessed simultaneously on the same gel. (C) Tumor
volumes derived from MCF-7(FLAG) and MCF-7(FLAG-Bcl-3) cells 36 days after cell inoculation.
Mice 1 to 12 were injected subcutaneously with pooled clones of control or MCF-7(FLAG-Bcl-3) on
either flank. Mice 13 to 15 received only an MCF-7(FLAG-Bcl-3) inoculation.


hormone-dependent and hormone-independent cells. To characterize the NF-κB complexes in these two
isolates, it was necessary to use MCF-7(early) cells which were cultured in E2-free medium, since
without E2 withdrawal there was insufficient NF-κB DNA binding activity to evaluate. We performed
an EMSA and supershift analysis of nuclear extracts from MCF-7(early) and LCC1 cells which, for
consistency, were both derived from culture in E2-depleted medium for 12 days (Fig. 6).
Supershift analysis with the p50 antibody again revealed the presence of a major complex
containing p50, while the p52 antibody was unable to supershift any complex. Importantly, the p65
antibody identified a strong p65-associated complex in the LCC1 cells. By contrast, the level of
p65 DNA binding activity was significantly lower in the MCF-7(early)

FIG. 6. Induction of both p65- and p50-associated NF-κB activity following E2 withdrawal. Nuclear
extracts were collected from MCF-7(early) and LCC1 cells cultured for 12 days in phenol red-free
medium with charcoal-stripped serum to induce NF-κB activity. Five micrograms of each extract was
used for EMSA with the NF-κB probe and antibody supershift analysis with the indicated antibodies
(Ab). The same quantity of extract was also subjected to EMSA with the Sp1 probe as a control for
extract loading.


nuclear extracts. Taken together with the nuclear expression of p65 and p50 in MCF-7 and LCC1
cells, these results show that after E2 withdrawal, LCC1 cells contain significantly more
p65-associated activity than do their E2-dependent counterparts.

LCC1 cells constitutively expressing IκBα revert to an estrogen-dependent tumor phenotype. In
order to test the hypothesis that NF-κB activity contributes significantly to the E2-independent
phenotype, we stably transfected LCC1 cells with a FLAG-tagged degradation-resistant form of the
NF-κB inhibitor IκBα called IκBαSR (27). IκBα most effectively inhibits p65-containing complexes
(26), although some minimal inhibition of p50/p50 activity has also been documented (30).
Figure 7A shows immunoblot analysis with the anti-FLAG monoclonal antibody of four positive
clones and control extracts from pcDNA3-transfected LCC1 cells. The effects of IκBαSR expression
on relative levels of nuclear NF-κB complexes were tested by supershift analysis. Figure 7B shows
that IκBαSR expression had little effect on the abundant, fast-migrating p50-containing complex.
In contrast, the p65 complex was virtually absent from the LCC1(IκBαSR) cells. To confirm that
any reversion of LCC1(IκBαSR) tumors to E2 dependence was associated with decreased nuclear
levels of p65, we performed immunoblot analysis of nuclear extracts from LCC1(pcDNA3) and
LCC1(IκBαSR) tumors. Figure 7C shows that p65 levels are markedly reduced in LCC1(IκBαSR) tumors
compared with controls. On the other hand, both the p50 and p52 nuclear contents remained
unaltered in these tumor types. The p50 antibody, which simultaneously detects p52 in extracts
from cultured cells, did not detect p52 in the tumor extracts. To assess p52, we utilized a
separate anti-p52 antibody, which indicated that the levels of nuclear p52 were unchanged in
IκBαSR-expressing tumors. Evaluation of the effects of IκBαSR on NF-κB transcriptional activity
following transient transfection of the three highest-expressing clones with the 3X-NF-κB
reporter gene or the same reporter gene containing a mutant NF-κB response element showed that
transcriptional activity in the three LCC1(IκBαSR) clones was 20- to 100-fold lower than that in
the control LCC1(pcDNA3) clones (Fig. 7D). We then inoculated three pooled LCC1(pcDNA3) clones
and pooled LCC1(IκBαSR) clones 1, 3, and 4 into the contralateral sides of ovariectomized nude
mice implanted with an E2 release pellet. Tumors from both groups grew at various rates under
these conditions of hormone replacement, and there was no significant difference in the final
tumor volumes [mean ± standard deviation, 183 ± 88 mm³ for LCC1(pcDNA3) tumors and 160 ± 88 mm³
for LCC1(IκBαSR) tumors]. Tumor volumes were measured 2 weeks after E2 pellet removal, and the
percent regression was calculated. Figure 7E is a histogram showing the results of this analysis.
While LCC1(pcDNA3) tumor volumes either remained stable or continued to grow after pellet
removal, LCC1(IκBαSR) tumors underwent significant regression. In cells engaged in apoptosis
through the mitochondrial death pathway, the Bax protein undergoes a conformational change
exposing its otherwise buried N terminus associated with mitochondrial translocation (37).
Figure 7F depicts sections from LCC1(pcDNA3) and LCC1(IκBαSR) tumors immunostained with an
antibody against the N terminus of Bax, showing high levels of translocated Bax in the
LCC1(IκBαSR) tumors but not in control tumors. Moreover, detection of DNA fragmentation in
apoptotic cells by using ISEL showed that LCC1(IκBαSR) tumors contained large numbers of
apoptotic nuclei, while LCC1(pcDNA3) tumors had almost none. Thus, NF-κB inhibition is sufficient
to restore

FIG. 7. MCF-7/LCC1 cells constitutively expressing IκBαSR lose p65/p50 activity and revert to an
E2-dependent tumor phenotype. (A) Immunoblot analysis to detect FLAG-tagged IκBαSR expression in
stable LCC1 clones with an anti-FLAG monoclonal antibody. Clones are designated 1 to 4. CON,
control. (B) Five micrograms of nuclear extract from LCC1(pcDNA3) and LCC1(IκBαSR) cell cultures
was subjected to EMSA and antibody (Ab) supershift analysis as described in Materials and
Methods. NS, normal serum. (C) Nuclear extracts (10 μg) from LCC1(IκBαSR) and LCC1(pcDNA3) tumors
were analyzed by immunoblotting for expression of p65, p50, and p52. The p50 antibody did not
identify a p52 band from these extracts, and therefore a p52-specific antibody was used to assess
changes in expression between clones. (D) NF-κB activity was evaluated in LCC1(IκBαSR) clones 1,
3, and 4 (from left to right) following transient cotransfection of the NF-κB-luciferase and LacZ
reporter genes. The results presented are the averages from two experiments, expressed in
arbitrary units, and are normalized to β-galactosidase activity. (E) Three pooled clones each of
LCC1(pcDNA3) and LCC1(IκBαSR) cells were inoculated into ovariectomized nude mice implanted with
an E2 release pellet. The histogram shows the percent regression of LCC1 (n = 11) and
LCC1(IκBαSR) (n = 9) tumors following removal of the release pellet. Bars represent standard
errors. (F) Sections of LCC1 and LCC1(IκBαSR) tumors were subjected to ISEL to detect free 3'-OH
DNA ends in apoptotic nuclei or immunostained with an antibody against the NH2 terminus of Bax to
detect activated mitochondrial Bax.

FIG. 8. Expression of IκBαSR downregulates Bcl-2 expression. (A) Whole-cell extracts from
LCC1(pcDNA3) (lanes P) and LCC1(IκBαSR) (lanes I) cell tumors at different days following E2
pellet removal were analyzed by immunoblotting for expression of Bcl-2. Blots were stripped and
reacted with antibody against Bcl-x. Loading of the gel was controlled for by reactivity with an
actin antibody. (B) LCC1 cells stably transfected with pcDNA3 or IκBαSR together with MIII and
M(late) cells were cultured in E2-free medium for 10 days and then treated with E2 or vehicle for
48 h. E2 regulation of Bcl-2 expression in vitro in the MCF-7-derived cells and clones was
assessed by immunoblotting of 15 μg of cell extract. Actin was used as an internal loading
control for all blots. Molecular masses in kilodaltons are shown on the right.


an E2-dependent phenotype to LCC1 tumors, rendering them apoptotic after E2 removal through a
mitochondrial pathway of cell death involving Bax translocation.
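
The percent regression plotted in Fig. 7E can be computed from a tumor's volume at pellet removal
and its volume 2 weeks later. The sketch below assumes the conventional definition (fractional
loss of volume relative to the volume at pellet removal); the text does not spell out the formula,
and the function names and tumor volumes are hypothetical placeholders, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def percent_regression(volume_at_removal, volume_after):
    """Percent loss of tumor volume relative to the volume at pellet removal.

    Positive values mean the tumor shrank; negative values mean it kept growing.
    """
    if volume_at_removal <= 0:
        raise ValueError("volume at pellet removal must be positive")
    return 100.0 * (volume_at_removal - volume_after) / volume_at_removal


def mean_and_standard_error(values):
    """Mean and standard error of the mean, as used for histogram error bars."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical (volume at removal, volume 2 weeks later) pairs in mm^3.
example_tumors = [(200.0, 100.0), (150.0, 90.0), (180.0, 126.0)]
regressions = [percent_regression(v0, v1) for v0, v1 in example_tumors]
avg, se = mean_and_standard_error(regressions)
print(f"mean regression = {avg:.1f}% (SE {se:.1f}%)")
```

Under this definition a tumor that remains stable scores 0% and one that continues to grow scores
below 0%, which matches the way the control tumors are described above.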

Effect of NF-κB inhibition on E2-regulated genes. Recent work has shown that expression of both
of the antiapoptotic proteins Bcl-2 and Bcl-x is induced by NF-κB in neurons (10, 51). Since
LCC1(IκBαSR) tumors undergo apoptotic regression following E2 withdrawal, we tested whether the
reduced NF-κB in these cells was associated with a reduction in Bcl-2 and/or Bcl-x. We first
considered the expression of these proteins in the respective tumors on various days following E2
release pellet removal. The results in Fig. 8A show that Bcl-2 protein levels were strongly
reduced in LCC1(IκBαSR) cells compared with LCC1(pcDNA3) cells for the entire period after
withdrawal of E2. In contrast, Bcl-x expression in the two tumor types was indistinguishable.
During the initial characterization of MIII and LCC1 cells, it was found that the expression of a
number of normally E2-responsive genes was constitutively upregulated and no longer responsive to
E2 (9). This was also true of another independent MCF-7 subline selected for E2-independent
growth (13). E2 can also regulate expression of Bcl-2 (43, 52), although the promoter for this
gene is complex and may also be regulated by mutant p53. A comparison of the expression of Bcl-2
and its regulation by E2 in vitro is shown in Fig. 8B. Unlike for several other E2-regulated gene
products, LCC1 cells contain lower levels of Bcl-2 than do parental MCF-7 cells. This expression
is refractory to E2 regulation. As observed in the tumors, stable expression of IκBαSR further
reduced Bcl-2 expression in vitro, and this was accompanied by a weak reinstatement of E2
regulation. Importantly, we have previously shown that constitutive Bcl-2 expression is
sufficient to prevent regression of MCF-7 tumors in nude mouse xenografts (44). Taken together,
these results suggest that the reduction in Bcl-2 expression in LCC1(IκBαSR) tumors could
contribute to the E2 withdrawal-induced apoptosis and subsequent regression.

DISCUSSION 

The events which precede the loss of E2 dependence and a more tumorigenic phenotype in breast
cancer cells are not understood. The LCC1 subline of the E2-dependent human breast cancer cell
line MCF-7 was selected under conditions of E2 depletion, resulting in an experimental model of
progression in that these cells grow in a hormone-independent manner but retain wild-type levels
of ER expression. Apart from the observation that some E2-regulated genes are constitutively
expressed in these cells, little is known about the cellular events that contribute to the loss
of E2 dependence. While it is clear that NF-κB activity rises in the context of
carcinogen-induced mammary epithelial cell transformation (28), in general there is also a marked
increase in constitutive NF-κB activity in ER⁻ cells compared with ER⁺ cells (36). Using
E2-independent LCC1 cells, the work described here shows that an increase in constitutive NF-κB
activity occurs in cells selected for E2 independence prior to loss of ER expression. Moreover,
inhibition of this NF-κB activity is sufficient to confer sensitivity to estrogen removal on LCC1
tumors through a mitochondrial death pathway. This observation also clearly indicates that LCC1
cells have not sustained damage to their apoptotic machinery during progression toward E2
independence. In contrast to K-BALB murine fibrosarcoma cells and mouse mammary tumor cells, in
which tumorigenicity was completely blocked by inhibition of NF-κB activity (7, 22), at least in
animals supplemented with E2, LCC1(IκBαSR) cells were able to form tumors as well as control
cells. Thus, inhibition of NF-κB activity through IκBα does not interfere with E2-driven
tumorigenicity.

The process of selection requires epigenetic changes which render some cells resistant to the
selective pressure. Therefore, to investigate the mechanism by which this occurs, we have
analyzed the consequence of E2 removal on NF-κB activity both in vitro and in tumors. The
reduction in E2 levels following removal of the E2 pellet in ovariectomized athymic mice and
culture in E2-free medium results in a rapid increase in NF-κB levels in MCF-7 cells. The
kinetics were fastest in vivo, which may reflect E2 clearance through metabolism. Additionally,
cells in culture were exposed to estrogenic phenol red prior to removal to phenol red-free
stripped medium, and the clearance rate of phenol red in these cells. Several reports have
previously shown that the ligand-bound ER is involved in reciprocal regulation with NF-κB
(reference 17 and references therein). Both ERα and ERβ can inhibit NF-κB-dependent transcription
in a manner involving the ligand binding domain of the receptor but not the squelching of
coactivator proteins (11). Discrete parts of the ER ligand binding domain surface are required
for this transrepression of NF-κB activity, which are separate from those involved in
transactivation (53). Estrogen has been clearly demonstrated to repress NF-κB activity, and
correspondingly, NF-κB induction follows E2 withdrawal. Estrogen removal over the time course of
our experiments in vitro is only cytostatic for MCF-7 cells. Given this and the fact that NF-κB
activation is reversed by reintroduction of E2 in vitro, it is unlikely that the induction of
NF-κB DNA binding seen after removal of E2 in vivo is the result of selection for cells with
higher basal levels of NF-κB. Moreover, NF-κB is a positive regulator of cell growth (12), thus
arguing against subpopulations of MCF-7 cells containing constitutively elevated NF-κB activity.

Since the predominant NF-κB complex in MCF-7 and LCC1 cells is either a homodimeric complex of
p50 or a heteromeric complex of p50 and p52, activation after hormone removal would not be
expected to yield a strong transcriptional response. Our finding that the level of the
p50-coactivating protein Bcl-3 is increased after E2 removal provides a mechanism by which E2
withdrawal induces p50-associated NF-κB activity in breast cancer cells that do not contain high
levels of the alternate transcriptionally active partner, p65. The observation that Bcl-3
increases the E2-dependent growth of MCF-7 cells may be due in part to the reported ability of
Bcl-3/p50/p52 complexes to induce cyclin D1 (23, 55) as well as stimulate AP-1-mediated
transactivation and cellular proliferation (35). Although the Bcl-3 regulatory regions contain
NF-κB enhancer elements (40) that are responsive to NF-κB (8, 10), this mechanism cannot account
for the rapid increase in Bcl-3 protein expression observed both in vitro and in vivo after E2
withdrawal, suggesting that Bcl-3 expression may be more directly under control of the ER.

The present data indicate that p50 and p65 are differentially expressed during progression and
that early in this process NF-κB activity increases and is composed primarily of p50/p50 (and
possibly p50/p52), along with a smaller increase in p50/p65. Highly malignant ER⁻ cells have both
p65/p50 and p50 (p50/p52)-associated NF-κB DNA binding activity. Evidence suggests that the p65
protein plays a critical role in tumorigenesis (22). Nuclear extracts from ER⁺ MCF-7 cells of
limited passage express only trace levels of p65 and comparatively low levels of p50. LCC1 nuclei
contain slightly elevated p65, although the level remains significantly lower than that in the
ER⁻ cell lines. However, LCC1 p50 nuclear expression is essentially similar to that in the
E2-independent breast cancer cell lines. Together with the finding that basal Bcl-3 expression in
LCC1 cells is also higher than that in the parental MCF-7 isolate, these results can account for
the higher constitutive level of both NF-κB complexes in LCC1 cells. The expression pattern of
p65 and p50/p52 in breast tumors is unclear. It has been suggested that breast cancer cell lines
but not tumors express p65. While Cogswell et al. (16) detected only low levels of p65 protein in
a panel of four breast tumor samples, only one of these was ER⁻. Supershift analysis was done on
an ER⁺ tumor which, in agreement with our results, showed primarily p50-associated NF-κB DNA
binding activity. Sovak et al. (50) detected nuclear p65 in 15 of 23 tumor samples, although that
study did not differentiate on the basis of ER status. Thus, it appears that NF-κB proteins and
complexes are differentially present in breast cancer cells, which, at least in cell lines,
correlates with both ER expression and the stage of progression. One of the most important
outcomes of this study is that it describes an underlying reason why primary breast tumors with
functional ERs may be marginally or not at all E2 dependent. With respect to both Bcl-3
expression and NF-κB activity, like LCC1 cells and MDA-MB-231 cells, tumors that display elevated
Bcl-3 could be either ER⁺ or ER⁻. Another variable that could affect these parameters in tumors
that are either truly E2 dependent or just E2 responsive is the level of circulating E2 in a
patient at the time of biopsy. Based on our results, Bcl-3 and NF-κB activity would not be
predicted to correlate simply with ER status in the absence of any other information about the
tumor or patient. Instead, ER expression combined with Bcl-3 and differential NF-κB activity
might predict the actual hormone-dependent status of the tumor and prove to be a useful marker
for progression.

The targets of NF-κB are numerous and varied, ranging from cytokines in the immune system to cell
adhesion molecules, transcription factors, a variety of enzymes, and certain survival proteins
(42). The last group includes the antiapoptotic protein Bcl-xL as well as Bcl-2, whose expression
has been shown to be activated by NF-κB in neuronal cells (10, 51). We have previously
demonstrated that Bcl-2 is positively regulated by E2 in MCF-7 cells (52), although this
regulation appears to be absent in LCC1 cells and basal levels of Bcl-2 are lower than in related
MCF-7 cells. We have also shown that Bcl-2 expression is sufficient to prevent E2 withdrawal
regression of MCF-7 tumors in nude mice (44). Thus, a further decrease in expression of Bcl-2 in
LCC1(IκBαSR) cells would likely contribute to the acquired E2 sensitivity of LCC1(IκBαSR) tumors
by facilitating the apoptotic response following the death signal constituted by E2 release
pellet removal. The fact that Bcl-2 but not Bcl-x was reduced by IκBαSR expression suggests that
p65/p50 complexes may positively regulate Bcl-2 expression while Bcl-x is regulated either by
different NF-κB complexes or by other means in these cells. The IAP protein hIAP1 (18) has been
shown to be induced by NF-κB (reference 42 and references therein), and while our preliminary
findings indicate higher levels of both hIAP1 and hIAP2 in those breast cancer cells containing
elevated constitutive NF-κB activity (data not shown), the role of these proteins in hormone
independence requires further investigation.

The work presented here demonstrates that NF-κB complexes play distinct roles in E2-dependent and
-independent growth and survival of breast cancer cells. Complexes of p50 (and possibly p52) and
Bcl-3 in LCC1 cells promote the growth of LCC1 cells in an E2-dependent manner and, although this
was not directly tested here, likely contribute to growth promotion in the presence of an
appropriate survival signal in E2-independent cells as well. The majority of NF-κB DNA binding
activity is composed of p50 dimers; however, expression of Bcl-3, which is required for the
transcriptional activity of these complexes, is insufficient to confer E2 independence to MCF-7
cells. Thus, p65-containing complexes in LCC1 cells,

FIG. 9. Proposed model of induction of NF-κB-mediated estrogen-independent growth and survival
following estrogen withdrawal from ER⁺ breast cancer cells. E2 provides growth and survival
signals to ER⁺ breast cancer cells. E2 withdrawal induces NF-κB activity which in most cells is
not sufficient to support E2-independent growth, and the tumor regresses. However, some cells
could persist as E2-independent variants as a result of adequate expression of differential NF-κB
activity. ER⁻ breast cancer cells have constitutive p65 and p50 activity. See text for details.


either alone or in concert with p50 dimer/Bcl-3 complexes, clearly have a critical role in
conferring the E2-independent phenotype. Supporting this hypothesis is the observation that
expression of IκBαSR protein clearly reduced nuclear p65 levels in LCC1 tumors but had little
effect on p50 or p52. Moreover, p65 complexes were virtually undetectable by supershift analysis
in LCC1(IκBαSR) cells. The crystal structure of the IκBα:NF-κB complex has been resolved,
revealing a stable and extensive interaction between IκBα and p65. The outcome of this
interaction is a critical change in the conformation of p65, resulting in allosteric inhibition
of DNA binding as well as masking of the nuclear localization signal in p65 (26). Although IκBα
can bind p50 homodimers, it does so with a 60-fold-lower affinity than in its interaction with
p65/p50 (30). Thus, the data presented here, as well as the fact that all ER⁻ or E2-independent
cell lines have high levels of p65/p50 NF-κB activity, support the hypothesis that breast cancer
cell progression requires increased p65/p50 activity.

Our results together with those in the literature suggest the model in Fig. 9, which indicates
pathways of growth and survival in ER⁺ breast cancer cells in both the presence and absence of
E2. E2 provides dual growth and survival signals at least in part by increasing cyclin D1 and
Bcl-2 protein levels. We surmise that the increase in Bcl-3 after E2 withdrawal could also
contribute to increased proliferative rates, based on the increased rate of tumor growth
generated by Bcl-3-overexpressing cells. This increased proliferation is insufficient, however,
to prevent the overall regression of the tumor over time. Thus, our model suggests that while
most E2-dependent cells will undergo cell death after E2 withdrawal in vivo, a subset with
adequate levels of p50/p52/Bcl-3 and p50/p65 will both grow and survive, thus establishing the
E2-independent phenotype. It is possible that these two events may not occur simultaneously in
tumor cells and that the immediate induction of Bcl-3 after E2 withdrawal may be sufficient over
the short term to maintain the tumor while the proliferative rate exceeds the rate of apoptosis.
This could provide adequate time for other signaling pathways, such as ErbB2, which is also
induced after E2 withdrawal in MCF-7 cells (data not shown), to induce p50/p65-associated
activity (6, 57). Studies to investigate the possible role of ErbB receptors in this process are
under way.

The ability of IκBα to inhibit E2-independent growth in this study shows that NF-κB activation
may be the most critical early event following E2 withdrawal resulting in breast cancer
progression. Therefore, in addition to representing a potential therapeutic target in ER⁻ breast
cancer as proposed by Biswas et al. (7), NF-κB inhibition may well prove to be effective in
obviating progression. Ironically, a reduction in circulating E2 levels (or spontaneous loss of
ER expression) in individuals with hormone-dependent breast cancer would directly precipitate the
early induction of NF-κB activity, which could confer E2-independent growth and survival to these
cells.

Abstract

Recent advances in machine learning and pattern recognition methods provide new analytical tools
to explore high-dimensional gene expression microarray data. Our data mining software, VISual
Data Analysis for cluster discovery (VISDA), reveals many distinguishing patterns among gene
expression profiles, which are responsible for the cell's phenotypes. The model-supported
exploration of high-dimensional data space is achieved through two complementary schemes:
dimensionality reduction by discriminatory data projection and cluster decomposition by soft data
clustering. Reducing dimensionality generates the visualization of the complete data set at the
top level. This data set is then partitioned into subclusters that can consequently be visualized
at lower levels and if necessary partitioned again. In this paper, three different algorithms are
evaluated in their abilities to reduce dimensionality and to visualize data sets: Principal
Component Analysis (PCA), Discriminatory Component Analysis (DCA), and Projection Pursuit Method
(PPM). The partitioning into subclusters uses the Expectation-Maximization (EM) algorithm and the
hierarchical normal mixture model that is selected by the user and verified "optimally" by the
minimum description length criterion. These approaches produce different visualizations that are
compared against known phenotypes from the microarray experiments. Overall, these algorithms and
user-selected models explore the high-dimensional data where standard analyses may not be
sufficient.
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I.  Introduction 

With gene expression microarrays, the relative expression levels in two or more mRNA populations
derived from tissue samples can be assayed for thousands of known sequenced genes simultaneously
[1], [2]. This has made microarrays an efficient and cost-effective tool for large-scale analysis of gene
expression in recent years. A microarray consists of a platform (glass slide, nylon filter, or chip) with
bound cDNA fragments or oligonucleotides that correspond to either known gene sequences or Expressed
Sequence Tags (ESTs). Microarrays have been used to classify clinical samples, to investigate the
mechanism of drug action, to examine the effects of drugs on gene expression in yeasts, and to identify
and validate novel therapeutics for cancer patients [1][3][4]. Hidden links still remain between genes
and the biology of cancer. They may, however, be revealed through large-scale gene expression
analysis of normal and cancer cells. Specifically, an altered gene expression pattern in malignant
tissues can determine their phenotypes, e.g., drug responses, proliferation rate, angiogenesis,
and metastases. Microarrays allow mass measurements of gene expression, but the tools
to analyze the data are not well developed [5]. Because the number of dimensions in a microarray
data set can range from hundreds to tens of thousands, the development of these analytical tools is
crucial.

Advances in microarray technologies have enabled investigators to explore the dynamics of
transcription on a molecular scale. The current challenge is to extract useful and reliable information out
of these large data sets. A common first approach is cluster analysis. The primary objective of
cluster analysis is to group genes that have comparable patterns of variation. This approach is
valuable for reducing the complexity of large data sets and for identifying predominant patterns within
the data. However, additional methods are needed to extract information about individual genes from
these large data sets.

The gene expression profile (mRNA), one of the molecular signatures (DNA, mRNA, and protein),
is a snapshot of the malignant and proliferative mechanism behind cancer. Each
sample's profile is represented as a point in a d-dimensional gene expression space in which each axis
represents the expression level of one gene. The presence of well-separated sample groups implies that
the representations of samples within the same group are close to each other in this gene expression
space but distant from those of other samples. Thus, the representations of phenotype-related samples
form clusters.

The research plan is divided into three major steps: cluster discovery, gene selection, and phenotype
prediction. Cluster discovery detects previously unrecognized tumor subtypes [5]. Gene selection
identifies the most relevant gene subset involved in the biological process that generates the patterns.
Phenotype prediction assigns unknown tumor samples to known tumor classes [5]. The main challenge,
however, is that microarray data are high-dimensional and multi-modal, and prior knowledge is lacking.

Data clustering is the process of grouping input data points with similar features in a
multidimensional space, and clustering algorithms have been investigated for a long time. The hierarchical
clustering method most often used by biologists presents its result as a dendrogram [6]. At the end of the
analysis the data points are arranged into a phylogenetic-style tree in which the similarity between two
items is represented by the length of the branch. However, even though hierarchical clustering is simple
and straightforward, it is designed to reflect a true hierarchical tree structure, and that is not the way in
which microarray data are generated. It is very important to include more biological information rather
than rigidly clustering data points. Hierarchical clustering may fail to group data points correctly
because it is greatly influenced by local conditions and has no opportunity to evaluate the global
structure. Self-Organizing Maps (SOM) attempt to search for relevant patterns by first imposing
structure on the data with nodes that are expected to eventually move to the center of each cluster,
and then updating the structure map in each iteration based on a data point randomly selected from
the data set [7]. The result gathers similar samples in the same cluster. Studies using
SOM to cluster genes have been done by the Whitehead Institute/MIT [7]. In the unsupervised situation,
the success of this approach partially depends on the initialization of the map structure, e.g., the number
of nodes and the choice of geometry. Without data modelling, SOM lacks criteria for validating the cluster
structure, e.g., whether the number of clusters is optimal.

A gene clustering method based on graph theoretic techniques has been developed for the situation in which
the clusters are not assumed to be hierarchically structured [8]. The cluster information is mapped to
an undirected graph where each clique in the graph indicates a cluster. The model assumes that
the input data contain an underlying cluster structure contaminated by random errors. By applying
this clustering algorithm, the cluster structure can be recovered with high probability by removing
the random errors from the input data. However, the algorithm was developed for gene clustering, in
which the input data have much lower dimensionality than the microarray data we have (about 20 vs.
500 to 8000), so its capability of handling high dimensional data is uncertain. In addition, microarray
data have significantly large overlaps among clusters resulting from the nature of biological data. The
prospects for applying this algorithm to microarray data are not good, since it may not be able to
effectively cluster data points with overlaps.

An interesting clustering approach using the support vector concept is presented in [9], where data points
are mapped to a high dimensional feature space and support vectors are used to define a sphere that
encloses them. Data points are clustered hierarchically by adjusting parameters in the kernel function
that mathematically represents the mapping from input space to feature space; outliers are allowed
by setting an appropriate penalty parameter. The method has several advantages: it finds clusters with arbitrary
shapes, needs no dimensionality reduction, and is capable of dealing with outliers. However,
clusters with sizable overlaps cannot be correctly found using this method, and therefore it is not very
suitable for microarray data clustering.
We propose a model-based hierarchical data exploration method that may largely overcome
the limitations of other methods. For instance, our method can evaluate the overall cluster structure,
while hierarchical clustering and SOM can only cluster blindly, starting without
any idea about the large-scale structure. The hierarchical visualization scheme helps discover any
hierarchical tree structure if one exists, but it is still valid if such a structure does not exist, since the
method is also designed for visualizing the inner structure of any cluster. The initialization of the
clustering is supported by user interaction and verified by a model selection criterion to find the
best structure description. The soft data decomposition allows the overlaps among clusters to be
respected and modeled. The model selection procedure provides a theoretical and quantitative tool for
cluster validation.

In this paper, we report progress in cluster discovery with our newly developed discriminatory
data mining methods [16][17]. The presentation entails three major components: (1) statistical
modeling of gene expression microarray data with a standard finite normal mixture (SFNM)
distribution; (2) development of a joint supervised and unsupervised data mining scheme to "discover" sample
clusters in a discriminative visual pyramid; and (3) evaluation of the data clusters produced by such a
scheme with phenotype-known microarray experiments. There are major differences between our work
and the most closely related previous research [18], [26], [22], [23], [24][25]. First, since the high complexity
of the data structure within a high dimensional space cannot be adequately explored by a single-level
visualization [18], we developed a hierarchical visualization paradigm involving mixture statistical
sub-models and visualization subspaces. The resulting data mining tool is capable of capturing the cluster
distribution structure in high dimensional space and discovering the relationships among clusters. Second,
we proposed three algorithms: 1) Discriminatory Component Analysis (DCA), 2) combined Projection
Pursuit Method (PPM) / Independent Component Analysis (ICA), denoted ICA/PPM, and 3) combined
PPM / Principal Component Analysis (PCA), denoted PCA/PPM. All three probabilistically project the
softly partitioned data set onto multiple visual subspaces. They allow an effective separation of local
clusters in dimensionally reduced visual subspaces, which may represent the original data set well.
Third, we implemented a probabilistic adaptive principal components extraction (PAPEX) algorithm
to estimate the top two principal axes and an incremental expectation-maximization (IEM) procedure
to estimate the SFNM distribution. The computation is efficient when confronted with high dimensional
data sets [14]. Finally, we imposed a model selection procedure to determine the number of sub-clusters
within each cluster using the Minimum Description Length (MDL) criterion. In addition, applying
the MDL criterion also determines whether a further split or partition of a subspace should continue
in completing the whole hierarchy [16], [24].

II.  Theory 

A.  Hierarchical  Visual  Data  Exploration  Scheme 

The purpose of cluster analysis is to determine whether there is a certain number of well-defined
data sets within the entire data distribution and/or to derive the most rational and optimal grouping scheme
to partition the data into a specified number of clusters.

Since  a  gene  expression  microarray  data  set  is  a  mixture  of  samples  of  cancer  and  non-cancer,  or 
a  mixture  of  samples  of  various  types  of  cancers,  the  SFNM  model  may  be  the  best  approach  for 
describing  such  multi-modal  data  structure  [12].  In  the  case  that  k  clusters  exist  in  the  data  set,  a 
mixture  model  with  k  normal  distributions  can  be  used  to  describe  the  overall  distribution  of  the  data. 
We  will  also  estimate  the  density  parameters  of  each  cluster  and  the  overall  mixture. 

Assume the sample points $\{t_i\}$ in gene expression space form $K_0$ clusters
$\{(\mu_{t1}, C_{t1}), \ldots, (\mu_{tk}, C_{tk}), \ldots, (\mu_{tK_0}, C_{tK_0})\}$,
where $\mu_{tk}$ and $C_{tk}$ are the mean vector and covariance matrix of cluster $k$, respectively.
Using the SFNM to model multi-modal distributions has been considerably successful recently [12]. Such
a data distribution takes a sum of the following general form

$$p(t) = \sum_{k=1}^{K_0} \pi_k \, g(t \mid \mu_{tk}, C_{tk}) \tag{1}$$

where $\pi_k$ is the corresponding mixing proportion, with $0 \le \pi_k \le 1$ and $\sum_{k=1}^{K_0} \pi_k = 1$, and $g$ is the
Gaussian kernel. The modeling of SFNM on the microarray data addresses a combination of the
detection of the structural parameter $K_0$ (e.g., cluster discovery) and the estimation of the regional parameters
$(\pi_k, \mu_{tk}, C_{tk})$ based on the observations $t$. One natural criterion used for this modeling is Maximum
Likelihood (ML) estimation using the Expectation-Maximization (EM) algorithm [12].
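
As an illustration of Eq. (1) and of ML estimation by EM, the following is a minimal sketch for a one-dimensional, two-component normal mixture. It is not the VISDA implementation: the function names (`gaussian_pdf`, `em_step`), the toy data, and the starting values are assumptions made for this example, and a real microarray analysis would use multivariate components.

```python
import math

def gaussian_pdf(t, mean, var):
    """One-dimensional Gaussian kernel g(t | mean, var)."""
    return math.exp(-0.5 * (t - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1-D normal mixture; returns updated parameters."""
    n_clusters = len(weights)
    # E-step: posterior probability z[i][k] that sample i belongs to cluster k.
    z = []
    for t in data:
        joint = [weights[k] * gaussian_pdf(t, means[k], variances[k])
                 for k in range(n_clusters)]
        total = sum(joint)  # mixture density p(t) as in Eq. (1)
        z.append([value / total for value in joint])
    # M-step: re-estimate mixing proportions, means, and variances.
    new_weights, new_means, new_variances = [], [], []
    for k in range(n_clusters):
        n_k = sum(row[k] for row in z)  # effective number of samples in cluster k
        mean_k = sum(row[k] * t for row, t in zip(z, data)) / n_k
        var_k = sum(row[k] * (t - mean_k) ** 2 for row, t in zip(z, data)) / n_k
        new_weights.append(n_k / len(data))
        new_means.append(mean_k)
        new_variances.append(var_k)
    return new_weights, new_means, new_variances

# Toy data (hypothetical): two well-separated groups around -5 and +5.
data = [-5.5, -5.0, -4.5, 4.5, 5.0, 5.5]
weights, means, variances = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
for _ in range(20):
    weights, means, variances = em_step(data, weights, means, variances)
print(weights, means, variances)
```

The posteriors computed in the E-step are "soft" memberships: every sample contributes to every cluster in proportion to its posterior probability, which is what allows overlapping clusters to be modeled.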

The very high dimensionality (500 to 8000) of microarray data introduces difficulties in revealing
the data structure, which have been well studied in [16]. Cover's theorem on the separability of patterns
tells us that a single projection onto a dimension-reduced space is not sufficient for revealing the
true data structure. A hierarchical visual exploration paradigm, involving hierarchical statistical models
and visualization spaces/subspaces, may provide more opportunities for the user to understand the
data distribution structure, and is essential for the study of high dimensional microarray data. We believe
that introducing user interaction into the clustering algorithm is a more practical
approach, which greatly reduces both the computational complexity and the likelihood of converging to a
local optimum [16][11]. A user-friendly graphical interface for data visualization was developed to allow the
user to select initial centers of the data clusters. To visualize data, we further developed data projection methods
based on the methods used in [16] in order to maximize the revelation of cluster structure.
The details of the various visualization techniques are introduced in the next sub-section.

In this approach, the techniques involved are: statistical modeling of microarray data with the SFNM
distribution, discriminative data projections jointly presented by supervised and unsupervised learning
processes, soft cluster decomposition based on an incremental expectation-maximization (IEM)
procedure, and evaluation of the data clusters with phenotype-known microarray experiments.

The hierarchical version of the SFNM model can be extended to include more levels based on the
same principle as above. The more hierarchical levels the tree has, the more sub-models are used,
and the finer the sub-models are. The formation of the hierarchical visualization tree is guided and
verified by model selection over x-subspaces. Model selection refers to the detection of the structural
parameter $K$ (the number of clusters or sub-clusters). In addition to the user's visual inspection, we
propose to use an information theoretic criterion, i.e., the Minimum Description Length (MDL) [27].
The MDL calculation is a model fitting procedure, in which an optimal model is selected such that
the selected model best fits the observed data. Thus, the value of $K$ is selected by minimizing

$$\mathrm{MDL}(K) = -\log(L_{ML}) + 0.5\, K_a \log N \tag{2}$$

where $K_a$ is the number of free adjustable parameters, and $L_{ML}$ is the joint maximum likelihood of
the model.
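
To make the use of Eq. (2) concrete, the sketch below scores a few candidate values of K and keeps the one with the smallest MDL. The log-likelihood values are invented purely for illustration (they are not results from this work), and the helper names `mdl` and `free_params_2d` are hypothetical. The parameter count 6K - 1 is the one quoted in the Algorithm section for a two-dimensional projection space; it corresponds to K - 1 mixing proportions, 2K mean coordinates, and 3K distinct covariance entries.

```python
import math

def mdl(log_likelihood, n_free_params, n_samples):
    """MDL(K) = -log(L_ML) + 0.5 * K_a * log(N), as in Eq. (2)."""
    return -log_likelihood + 0.5 * n_free_params * math.log(n_samples)

def free_params_2d(k):
    """Free parameters of a K-component normal mixture in a 2-D x-space."""
    # (K - 1) mixing proportions + 2K mean coordinates + 3K covariance entries
    return 6 * k - 1

# Hypothetical maximized log-likelihoods for K = 1..4 (illustrative values only).
candidate_log_likelihoods = {1: -520.0, 2: -455.0, 3: -450.0, 4: -448.5}
n_samples = 100

scores = {k: mdl(ll, free_params_2d(k), n_samples)
          for k, ll in candidate_log_likelihoods.items()}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

In this made-up example the likelihood keeps improving as K grows, but the penalty term grows faster beyond K = 2, so the criterion declines the extra clusters.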

B.  Discriminatory  Data  Projection 

The purpose of developing discriminatory data projection tools is to maximally discover hidden
cluster structure in the data space. The use of multiple data projection tools is
primarily motivated by the fact that the performance of an individual projection scheme tends to be
case-dependent, due to the limited number of data samples in nearly all existing microarray data. Therefore,
it is insufficient to use only one projection tool, since doing so increases the risk of missing the
cluster structure. The four discriminatory projection tools presented in this paper are: PCA, DCA,
PCA/PPM, and ICA/PPM. The details of each method are discussed in the following sub-sections.

B.1  PCA 

PCA is an effective unsupervised method for achieving dimensionality reduction [22], [14], [10]. For a
set of observed $d$-dimensional data vectors $\{t_i\}$, $i \in \{1, \ldots, N\}$, the $q$ principal axes $w_m$,
$m \in \{1, \ldots, q\}$ with $q \le d$, are those orthogonal axes onto which the retained variances under projection are maximal. It
can be shown that the principal axes $w_m$ are given by the $q$ dominant eigenvectors (i.e., those with the $q$ largest
eigenvalues) of the sample covariance matrix

$$C_t = \frac{1}{N} \sum_{i=1}^{N} (t_i - \mu_t)(t_i - \mu_t)^T \tag{3}$$

such that

$$C_t w_m = \lambda_m w_m \tag{4}$$

where $\mu_t$ is the sample mean and $\lambda_m$ is the eigenvalue. The vector $x_i = W^T (t_i - \mu_t)$, where
$W = (w_1, w_2, \ldots, w_q)$, is thus a $q$-dimensional new representation of the observed vector $t_i$. Two
issues contribute to the limitations of conventional PCA: its global linearity without incorporating
data structure, and its optimality based on reconstruction error rather than pattern separability.
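
Eqs. (3) and (4) can be sketched directly with a batch eigendecomposition. This is only an illustration on a small invented data set; the function name `principal_axes` is hypothetical, and the sketch does not reproduce the adaptive APEX computation used in this work.

```python
import numpy as np

def principal_axes(T, q=2):
    """Return W (d x q), the q largest eigenvalues, and X = (T - mean) W."""
    mu = T.mean(axis=0)
    centered = T - mu
    C = centered.T @ centered / T.shape[0]       # sample covariance, Eq. (3)
    eigvals, eigvecs = np.linalg.eigh(C)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:q]        # indices of the q dominant axes
    W = eigvecs[:, order]                        # columns solve Eq. (4)
    X = centered @ W                             # rows are x_i = W^T (t_i - mu)
    return W, eigvals[order], X

# Toy 3-D data (hypothetical) that varies mostly along the (1, 1, 0) direction.
T = np.array([[2.0, 2.0, 0.0], [-2.0, -2.0, 0.0], [1.0, 1.0, 0.1],
              [-1.0, -1.0, -0.1], [0.5, -0.5, 0.0], [-0.5, 0.5, 0.0]])
W, eigvals, X = principal_axes(T, q=2)
print(W, eigvals)
```

The sign of each column of `W` is arbitrary, so comparisons against a known direction should use the absolute value of the dot product.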

B.2  DCA

If class information is known, the search for directions in data space that reveal cluster structure
is better guided. There are two types of class information we may be able to obtain: known
phenotypes from the biological experimental setting, and sub-cluster information resulting from cluster
decomposition based on an unsupervised projection (PCA or PPM). For the top level projection, DCA
is a supervised process that uses the known phenotype (class) information in the search for a projection.
However, DCA can also be used in an unsupervised situation, namely on the sub-levels, by using the
second type of class information discussed above. Demonstrations of different applications of DCA are
shown in the Results section.

When confronting a multi-modal data set, however, a better strategy is to emphasize the inter-cluster separation
by replacing the total covariance matrix with Fisher's scatter matrix [26], [10], i.e., to find the
eigenvectors of $S_w^{-1} S_b$

$$S_w^{-1} S_b w_m = \lambda_m w_m \tag{5}$$

where the within-cluster scatter matrix ($S_w$) is the joint scatter of the data points $t_i$ around the conditional
mean vectors of the $K_0$ classes (on the top level) or sub-clusters (on the sub-levels)

$$S_w = \sum_{k=1}^{K_0} \pi_k C_{tk} \tag{6}$$

JOURNAL  OF  SIGNAL  PROCESSING  SYSTEMS:  WANG,  ZHANG,  LU,  LEE,  KUNG,  CLARKE 




with cluster conditioned covariance matrix

$$C_{tk} = \frac{\sum_{i=1}^{N} z_{ik} (t_i - \mu_{tk})(t_i - \mu_{tk})^T}{\sum_{i=1}^{N} z_{ik}} \tag{7}$$

where

$$z_{ik} = \frac{\pi_k \, g(t_i \mid \mu_{tk}, C_{tk})}{p(t_i)}, \tag{8}$$

and the between-cluster scatter matrix ($S_b$) is the scatter of the cluster conditional mean vectors
around the overall data center

$$S_b = \sum_{k=1}^{K_0} \pi_k (\mu_{tk} - \mu_t)(\mu_{tk} - \mu_t)^T \tag{9}$$

such that the separability of patterns is maximized, that is

$$W = \arg\max_{W_0} \, \mathrm{Trace}\left(W_0^T S_w^{-1} S_b W_0\right). \tag{10}$$

This is termed Discriminatory Component Analysis.
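
The following sketch builds the scatter matrices from a membership matrix and solves the eigenproblem of Eq. (5) in batch form. It makes several assumptions that are not from the source: the function name `dca_axes` is hypothetical, hard 0/1 memberships stand in for the posteriors $z_{ik}$, and $S_w$ is taken to be invertible. For real microarray data, where the dimension far exceeds the number of samples, $S_w$ would be singular and some prior reduction or regularization would be needed.

```python
import numpy as np

def dca_axes(T, Z, q=2):
    """Leading eigenvectors of S_w^{-1} S_b for data T (N x d), memberships Z (N x K)."""
    n_samples, dim = T.shape
    n_k = Z.sum(axis=0)                        # effective size of each cluster
    pi = n_k / n_samples                       # mixing proportions
    means = (Z.T @ T) / n_k[:, None]           # cluster conditional mean vectors
    mu = T.mean(axis=0)                        # overall data center
    S_w = np.zeros((dim, dim))
    S_b = np.zeros((dim, dim))
    for k in range(Z.shape[1]):
        diff = T - means[k]
        C_k = (Z[:, k, None] * diff).T @ diff / n_k[k]   # Eq. (7)
        S_w += pi[k] * C_k                               # within-cluster scatter
        dm = (means[k] - mu)[:, None]
        S_b += pi[k] * (dm @ dm.T)                       # Eq. (9)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_w, S_b))   # Eq. (5)
    order = np.argsort(eigvals.real)[::-1][:q]
    return eigvecs[:, order].real, eigvals.real[order]

# Toy data (hypothetical): classes differ along x but spread widely along y,
# so the largest-variance (PCA) direction is y while the discriminatory one is x.
T = np.array([[-1.0, -3.0], [-1.0, 3.0], [-1.2, 0.0], [-0.8, 0.0],
              [1.0, -3.0], [1.0, 3.0], [1.2, 0.0], [0.8, 0.0]])
Z = np.array([[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4)
W, vals = dca_axes(T, Z, q=1)
print(W.ravel(), vals)
```

Passing soft posteriors in `Z` (rows summing to one) instead of 0/1 labels gives the probabilistic variant used on the sub-levels.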

The original vectors $\{t_i\}$ are linearly transformed by $W$, a $d \times 2$ matrix, through $x = W^T (t - \mu_t)$
into a two-dimensional projection space $x = (x_1, x_2)^T$. For a normal distribution $g(t \mid \mu_{tk}, C_{tk})$ over the
data space, a similar dimensionally reduced probability distribution $g(x \mid \mu_{xk}, C_{xk})$ of the new variable
$x$ in the projection space is simply defined by the Radon transform of $g(t \mid \mu_{tk}, C_{tk})$

$$g(x \mid \mu_{xk}, C_{xk}) = \int g(t \mid \mu_{tk}, C_{tk}) \, \delta(x - W^T t + W^T \mu_t) \, dt \tag{11}$$

where $\delta(\cdot)$ is the Dirac delta function, which vanishes for any nonzero argument and integrates to one. According
to the linear superposition property of the Radon transform and the projection invariance property of the
normal distribution, we have

$$f(x) = \sum_{k=1}^{K_0} \pi_k \, g(x \mid \mu_{xk}, C_{xk}) \tag{12}$$

as the counterpart of Eq. (1) in the x-space defined by the projection matrix $W$.
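
The projection invariance used above has a simple closed form: under $x = W^T (t - \mu_t)$, a normal component with mean $\mu_{tk}$ and covariance $C_{tk}$ stays normal, with mean $W^T (\mu_{tk} - \mu_t)$ and covariance $W^T C_{tk} W$. The sketch below applies this standard property of linear maps of Gaussians; the function name `project_component` and all numeric values are illustrative assumptions.

```python
import numpy as np

def project_component(mean_t, cov_t, W, data_mean):
    """Mean and covariance of g(x | mu_x, C_x) for x = W^T (t - data_mean)."""
    mean_x = W.T @ (mean_t - data_mean)
    cov_x = W.T @ cov_t @ W
    return mean_x, cov_x

# Hypothetical 3-D component projected onto its first two coordinates.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
mean_t = np.array([1.0, 2.0, 3.0])
cov_t = np.diag([1.0, 2.0, 3.0])
data_mean = np.array([0.5, 0.5, 0.5])

mean_x, cov_x = project_component(mean_t, cov_t, W, data_mean)
print(mean_x, cov_x)
```

Because the mixing proportions are unchanged by the projection, applying this to every component yields the x-space mixture of Eq. (12).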

However, when the data set is projected onto a single lower dimensional subspace, its inherent
multi-modal nature may be partially or completely obscured according to Cover's theorem on the separability
of patterns [14]. In other words, even though the cluster structure of a data set may be evident in
the higher dimensional space, it is quite conceivable for the finer cluster patterns to be concealed after
a single linear projection, leading to an unidentifiable correspondence between Eq. (1) and Eq. (12)
[16]. A novel approach is to model a high-dimensional multi-modal data set with a hierarchical mixture
model and accordingly with a collection of probabilistic principal discriminatory subspaces [16], [22],
[23], [24], namely exploratory cluster discovery.

Assume a top-level model consisting of a single Radon transform $W$ and a mixture of $K_1$ ($K_1 \le K_0$)
normal distributions $p(t) = \sum_{k=1}^{K_1} \pi_k \, g(t \mid \mu_{tk}, C_{tk})$ which is identifiable in x-space, i.e.,
$f(x) = \sum_{k=1}^{K_1} \pi_k \, g(x \mid \mu_{xk}, C_{xk})$. We can form a two-level hierarchy by associating a group of
SFNM sub-models with each model $k$ at top-level

$$p(t) = \sum_{k=1}^{K_1} \pi_k \sum_{j=1}^{K_{2,k}} \pi_{j|k} \, g(t \mid \mu_{t(k,j)}, C_{t(k,j)}) \tag{13}$$

where $\pi_{j|k}$ again corresponds to a set of mixing proportions, one for each $k$, with $0 \le \pi_{j|k} \le 1$
and $\sum_{j=1}^{K_{2,k}} \pi_{j|k} = 1$, and $\sum_{k=1}^{K_1} K_{2,k} = K_0$. To reveal the hidden cluster pattern within each model
$k$ at top-level, i.e., $g(t \mid \mu_{tk}, C_{tk}) = \sum_{j=1}^{K_{2,k}} \pi_{j|k} \, g(t \mid \mu_{t(k,j)}, C_{t(k,j)})$, an associated probabilistic principal
discriminative subspace is constructed that focuses on the separability of patterns within the data
portion defined by model $k$, where the opaque degree of a data point in the subspace plot is proportional
to its posterior probability of belonging to this model, i.e., $z_{ik}$ determined at top-level.

The further cluster discovery is a two-stage procedure: a soft partitioning of each model $k$ into $K_{2,k}$
sub-clusters followed by the construction of the corresponding subspace. Instead of assigning each given data
point exclusively to one subspace, the contribution to its generation is shared among all the subspaces.
The subspaces for the sub-models at second-level are generated by the probabilistic DCA such that

$$S_{k,w}^{-1} S_{k,b} \, w_{k,m} = \lambda_{k,m} w_{k,m} \tag{14}$$

where $S_{k,w} = \sum_{j=1}^{K_{2,k}} \pi_{j|k} C_{t(k,j)}$ with subcluster conditioned covariance matrix
$C_{t(k,j)} = \sum_{i=1}^{N} z_{i(k,j)} (t_i - \mu_{t(k,j)})(t_i - \mu_{t(k,j)})^T / \sum_{i=1}^{N} z_{i(k,j)}$,
and $S_{k,b} = \sum_{j=1}^{K_{2,k}} \pi_{j|k} (\mu_{t(k,j)} - \mu_{tk})(\mu_{t(k,j)} - \mu_{tk})^T$. The probability distribution of model $k$ in
x-space at second-level is now defined by the model $k$ focused Radon transform of $g(t \mid \mu_{tk}, C_{tk})$, i.e.,
$g(x \mid \mu_{xk}, C_{xk}) = \int g(t \mid \mu_{tk}, C_{tk}) \, \delta(x - W_k^T t + W_k^T \mu_{tk}) \, dt$. It should be noted that each component in
Eq. (13) now corresponds to an independent sub-model with projection matrix $W_k$. To interpret the
corresponding set of visualization subspaces, all data points $x_{ik} = W_k^T (t_i - \mu_{tk})$ will appear in every
plot of the $K_1$ subspaces at the second-level, with their opaque degree proportional to $z_{ik}$.

B.3  PCA/PPM 

In the effort to search for projections that give cluster separability, we take an alternative unsupervised
approach, the Projection Pursuit Method (PPM), which searches for "interesting" projections. Even though
there is no universal agreement in the PPM research community on what constitutes an "interesting" projection,
the definition of an "interesting" projection given by some leading researchers does meet our needs in this
particular project: a projection where the data separate into distinct and meaningful clusters
[20]. We have two particular goals in using PPM in this project: (1) to find low dimensional (three or
fewer) projections that provide the most revealing view of the overall data distribution; and (2) to use
PPM for dimensionality reduction so that we focus directly on the discriminatory projections rather
than indirectly searching through covariance. The advantage of PPM is that it finds directions that
are not affected by the linear scale and correlational structure of the data, which is the disadvantage
of PCA.

As mentioned before, if we want to put the human ability for pattern discovery to good use on four or
higher dimensional data, we shall first look at the projections onto the spaces spanned by two or three
of the dimensions. In most cases, any arbitrary direction could be the right one for cluster structure
discovery. This implies that we would have to search the space in all possible directions, and the problem
becomes even worse because the search is exhaustive. In our approach, we tried to simplify the PPM by using
non-Gaussianity as a criterion, and by using PCA and Independent Component Analysis (ICA) as vehicles
for finding discriminatory projections.

The direction with a Gaussian or super-Gaussian distribution is the one containing the least data
structural information; on the other hand, the least Gaussian distribution indicates plentiful structural
information. If the data are distributed as one Gaussian or super-Gaussian distribution (a "spiky"
probability density function (pdf) with long and heavy tails) in a direction, it implies that the data points
most likely form a one-cluster structure instead of the multi-modal structure that is important
for cluster separation. On the contrary, the data may form two or more clusters in the directions
where the distributions are non-Gaussian, e.g., sub-Gaussian (a "flat" pdf, more like a uniform
distribution). We used one of the non-Gaussianity measures, kurtosis, defined as
$\mathrm{kurt}(y) = E\{y^4\} - 3\,(E\{y^2\})^2$, where
$y$ denotes the random variable. Kurtosis can be either positive or negative. A random variable with a
positive kurtosis is considered super-Gaussian, and one with a negative kurtosis is considered
sub-Gaussian [21][15].

We try to fully utilize our PCA results to test PPM in order to reduce the computational load. A
prototype computer algorithm, termed PCA/PPM, is implemented to calculate the kurtosis of each
principal component resulting from PCA and rank the components, so that the optimal directions are found to
be those that show the strongest sub-Gaussian behavior.
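
A minimal sketch of this ranking step is given below, under the assumption that the principal components come from a batch eigendecomposition rather than from the prototype described here. The function names and the toy data are hypothetical. The data are built so that the largest-variance component is unimodal with heavy tails, while the smaller-variance component is bimodal.

```python
import numpy as np

def kurtosis(y):
    """kurt(y) = E{y^4} - 3 (E{y^2})^2 for a centered projection y."""
    y = y - y.mean()
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

def rank_components_by_kurtosis(T):
    """Project T onto its principal axes and rank the components by kurtosis."""
    centered = T - T.mean(axis=0)
    cov = centered.T @ centered / T.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)     # ascending eigenvalues
    W = eigvecs[:, ::-1]                       # principal axes, largest variance first
    Y = centered @ W
    kurt = np.array([kurtosis(Y[:, m]) for m in range(Y.shape[1])])
    order = np.argsort(kurt)                   # most negative (sub-Gaussian) first
    return order, kurt, W

# Toy data (hypothetical): x is mostly zero with two far outliers (super-Gaussian),
# y takes only the values -1 and +1 (two clusters, sub-Gaussian).
xs = np.array([-6.0] + [0.0] * 7 + [6.0])
ys = np.array([-1.0, 1.0])
T = np.array([[x, y] for x in xs for y in ys])

order, kurt, W = rank_components_by_kurtosis(T)
print(order, kurt)
```

Here PCA alone would favor the first component because of its larger variance, whereas the kurtosis ranking puts the bimodal second component first.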

B.4  ICA/PPM 

Independent component analysis (ICA) is a recently developed method for finding linear
representations of non-Gaussian data such that the components are statistically independent, or as independent
as possible. Since PPM is designed to search for the directions with the least Gaussian distribution, and the
least Gaussian distribution is the criterion used to estimate the ICA model, ICA and PPM are closely related.
The non-Gaussianity measures can be adopted as projection pursuit functional indices. The ICA/PPM
algorithm, that is, PPM assisted by ICA, is then implemented to directly search for directions with a
non-Gaussian distribution in data space [10] [21]. Similar to PCA/PPM, kurtosis is calculated on each
independent component resulting from ICA, and the two components with the most negative kurtosis are
chosen for the 2-D projection.
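
The idea can be sketched with a small kurtosis-based fixed-point ICA (a deflationary scheme in the style of FastICA) followed by the same kurtosis ranking. This is an assumption about one possible realization, not the ICA/PPM code of this work; the function names, the mixing matrix, and the two synthetic sources are all hypothetical. With only two components in this toy case, the ranking simply orders them.

```python
import numpy as np

def kurtosis(y):
    """kurt(y) = E{y^4} - 3 (E{y^2})^2 for a centered projection y."""
    y = y - y.mean()
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

def whiten(T):
    """Center the data and rescale it to identity covariance."""
    centered = T - T.mean(axis=0)
    cov = centered.T @ centered / T.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    return (centered @ eigvecs) / np.sqrt(eigvals)

def ica_by_kurtosis(Z, n_components, n_iter=200, tol=1e-12, seed=0):
    """Deflationary fixed-point ICA with the kurtosis contrast on whitened data Z."""
    rng = np.random.default_rng(seed)
    W = np.zeros((n_components, Z.shape[1]))
    for c in range(n_components):
        w = rng.standard_normal(Z.shape[1])
        w -= W[:c].T @ (W[:c] @ w)             # start orthogonal to earlier rows
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = Z @ w
            w_new = (Z * (y ** 3)[:, None]).mean(axis=0) - 3.0 * w
            w_new -= W[:c].T @ (W[:c] @ w_new)  # stay orthogonal to earlier rows
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:
                break
        W[c] = w
    return W

# Two synthetic independent sources (hypothetical): one two-valued, one uniform grid.
s1 = np.array([-1.0, 1.0])
s2 = np.linspace(-1.0, 1.0, 9)
s2 = s2 / np.sqrt(np.mean(s2 ** 2))            # unit variance
S = np.array([[a, b] for a in s1 for b in s2])
A = np.array([[1.0, 0.5], [0.3, 1.0]])         # hypothetical mixing matrix
T = S @ A.T

Z = whiten(T)
W = ica_by_kurtosis(Z, n_components=2)
Y = Z @ W.T                                    # estimated independent components
kurt = np.array([kurtosis(Y[:, c]) for c in range(Y.shape[1])])
order = np.argsort(kurt)                       # most negative kurtosis first
best = Y[:, order[0]]
print(order, kurt)
```

The recovered components are defined only up to sign and ordering, which is why the selection is made by kurtosis rather than by position.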


III.  Algorithm 


We now present the description of our algorithm, which proceeds progressively by fitting a series of
sub-models to the clusters of the data set interactively and incrementally.

The algorithm begins by determining $W$ for the top-level projection (a two-dimensional x-space).
The initial estimate of $W$ is obtained from our previously developed APEX neural computation (e.g.,
Eq. (4)) [14], and further modified by the DCA-APEX algorithm (e.g., Eq. (5) with or without $S_w^{-1}$)
[16], where the prerequisite is to estimate the SFNM model at top-level (e.g., Eq. (1)). To remedy
the problem of high dimensionality with microarray data, the neural computation of $(W, \pi_k, \mu_{tk}, C_{tk})$ is
efficient, in that only the top two eigenvectors of the covariance or scatter matrix are calculated, and the
model parameter values are first estimated in x-space and then fine tuned in t-space incrementally.
For example, the Incremental EM (IEM) procedure provides "soft" splits of the data points, hence
allowing the data to contribute simultaneously to multiple clusters, which results in
E-Step 


Z(i+1)k 


(15) 


M-Step
for $k = 1, \ldots, K_1$, where $a(i)$ and $b(i)$ are introduced as the learning rates, two sequences converging to
zero, ensuring unbiased estimates after convergence. The user will pin-point the initial cluster centers
$\mu_{xk}^{(0)}$ and assign $\pi_k^{(0)} = 1/K_1$ and $C_{xk}^{(0)} = W^T C_t W$. The optimum value of $K_1$ is determined based on
MDL (e.g., Eq. (2)) where $K_a = 6K_1 - 1$.

Determination  of  the  subspaces  Wfc  and  sub-models  (jTjjk, ''  t(kj))^t(kj))  at  the  second-level  can  again 
be  viewed  as  a  two-step  estimation  problem,  in  which  further  splitting  of  the  sub-models  is  determined 
within  each  of  the  clusters  identi.,.ed  at  the  top-level  such  that  its  internal  structures  can  be  further 
explored  over  cluster-focused  x-space.  The  initial  estimate  of  can  be  obtained  using  a  probabilistic 
APEX  (PAPEX)  algorithm. 

The corresponding $(\pi_{j|k}, \mu_{t(k,j)}, C_{t(k,j)})$ can again be estimated using the IEM algorithm to allow
an SFNM distribution with $K_{2,k}$ sub-models to be fitted to cluster $k$, where the user will pin-point
the initial subcluster centers $\mu_{t(k,j)}$ and assign $\pi_{j|k} = 1/K_{2,k}$ and $C_{t(k,j)} = W_k^T C_{t_k} W_k$ to initialize
the algorithm, with a model selection procedure. By replacing $z_{ik}(t_i - \mu_{t_k})$ in the PAPEX
formulation with $(\mu_{t(k,j)} - \mu_{t_k})$, $W_k$ is updated by a DCA-PAPEX procedure to generate a
separability-based and cluster-focused subspace for model $k$ at the second-level.

The construction of the entire hierarchical tree is completed when no further data splitting is
recommended in all of the parent subspaces, followed by the generation of the bottom-level subspaces
(for example, the third-level). The value of $W_{(k,j)}$ is obtained using the PAPEX algorithm with $z_{i(k,j)}$
instead of $z_{ik}$, and all data points $x_i$ (mapped to $t_{i(k,j)}$) will appear in every plot of the total $K_2$
subspaces at the bottom-level, with their opaque degree equal or proportional to $z_{i(k,j)}$.

IV.  Results 

The capability of PCA/PPM, DCA, and ICA/PPM to find cluster structure is
first demonstrated on a simulated data set that consists of three dimensions and four clusters (N=100 for
each cluster). The results are illustrated in Figure 1, where we can see that DCA, PCA/PPM, and ICA/PPM
reveal more information about the cluster structure than
conventional PCA. The four-cluster structure is clearly shown in (b), (c), and (d) in Figure 1, but only
three clusters are seen in the PCA plot (a) when the color information is disregarded. We
emphasize that PCA, PCA/PPM and ICA/PPM are all completely unsupervised processes
that do not rely on the known class information; only DCA is supervised in this case. For the unsupervised
methods, the class information is used only to show the four distinct classes with four different colors and symbols.

Model selection is performed on the top level projection of the simulated data generated by DCA,
and the results show that a four-cluster structure best describes the data distribution on this
level. In Figure 2, five different model selection patterns are tested and the MDL curve is plotted;
the MDL suggests that the four-cluster structure is best.
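
The MDL comparison can be sketched numerically. Eq. (2) is not reproduced in this section, so the sketch assumes the common form MDL(K_0) = -log L + (K_a / 2) log N with K_a = 6K_0 - 1. That count follows from a 2-D Gaussian mixture: each component has two mean terms, three covariance terms, and one mixing weight, and one weight is fixed by the sum-to-one constraint. The log-likelihood values below are hypothetical numbers chosen only to show the mechanics, not results from the paper.

```python
import math

def n_free_params(k0):
    """Free parameters of a K0-component 2-D Gaussian mixture: 6*K0 - 1."""
    return 6 * k0 - 1

def mdl(log_lik, k0, n):
    """Assumed MDL form: negative log-likelihood plus a complexity penalty."""
    return -log_lik + 0.5 * n_free_params(k0) * math.log(n)

# Hypothetical maximized log-likelihoods for K0 = 1..5 on N = 400 points.
loglik = {1: -1900.0, 2: -1700.0, 3: -1580.0, 4: -1500.0, 5: -1495.0}
scores = {k: mdl(ll, k, 400) for k, ll in loglik.items()}
best_k = min(scores, key=scores.get)  # model with the smallest MDL value
```

With these illustrative numbers, the small likelihood gain from four to five clusters does not offset the extra penalty for six more parameters, so the four-cluster model has the lowest MDL.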

A hierarchical visualization tree, as shown in Figure 3(a), is generated on the simulated data set.
The top figure is a top level projection of the complete data set, where we can only see three clusters.
The middle figure is a second level projection that provides individual, different views of the three
sub-clusters selected in the top level projection. In the second level, we can see two hidden clusters in
sub-cluster #1; this gives the user the opportunity to discover more information about the data structure
and makes further partitioning possible. Clusters are partitioned and shown in their own windows
in the bottom figure. In this experiment, only conventional PCA is used to generate all projections.

To illustrate a joint way of discovering data structure by combining PCA, DCA, PCA/PPM
and ICA/PPM, we also tested PCA/PPM, DCA, and ICA/PPM in the generation of the hierarchical
visualization tree. In this case, since conventional PCA is unable to locate the directions in the
space where the real cluster structure can be displayed, we can use PCA/PPM, DCA, and ICA/PPM as
alternatives. When DCA is used to plot the top level projection in Figure 3(b), more information
about the cluster structure is revealed after the first step, i.e., the directions that show the four-cluster
structure are found by DCA. For the simulated data set, the hierarchical visualization by
DCA is used as an illustration because the DCA, PCA/PPM and ICA/PPM results are similar to one another.
The two hierarchical trees in Figure 3 are produced independently. The consistency between the clustering
results and the known class grouping can be seen from the color of the data points in each individual
window on the bottom level. The fact that the data points within one cluster have the same color
implies that the data points belonging to the same phenotype group are grouped into one cluster.

Besides testing the methodology on the simulated data, we also evaluate real microarray data sets
from the National Cancer Institute (NCI) and the Massachusetts Institute of Technology (MIT). In the
2308-dimensional (genes) microarray data set of round blue cell tumors from NCI, there are four classes:
neuroblastoma (NB; N=12), rhabdomyosarcoma (RMS; N=21), non-Hodgkin lymphoma (NHL; N=8)
and the Ewing family of tumors (EWS; N=23). Figure 4 illustrates how all projection methods (PCA,
PCA/PPM, DCA, and ICA/PPM) explore cluster structure on the NCI data. In our experience,
each of these four methods succeeds to a different extent in finding directions that reveal cluster structure in
various cases. Thus, using all of them to examine the data structure is more informative than using one
or two methods combined. As discussed above, in this experiment the known phenotype information
is used only for coloring the plots in the PCA, PCA/PPM and ICA/PPM experiments rather
than in the projection search; the exception is the DCA experiment, which is supervised.

PCA and PCA/PPM on the NCI data generate hierarchical visualization trees (Figure 5). Using
PCA to generate 2-D projections produces the left hierarchical tree, Figure 5(a), and applying
PCA/PPM generates the right tree, Figure 5(b). MDL curves are also plotted for the two different
top level projections. The curves indicate that the three-cluster structure from PCA and the four-cluster
structure from PCA/PPM best describe the data distribution at the top level projections.
The number of clusters determined by MDL agrees with the user's visual inspection. As discussed in
the hierarchical exploration experiment on the simulated data, the consistency between the clustering and
the known biological phenotypes defined above is shown by the uniform color of the data points within
each cluster. The clustering scheme recommended by the MDL measure also indicates the consistency
between the biological phenotype information and the final clustering in the bottom level of Figure 5. In
Figure 5(a), although MDL cannot capture the four-cluster structure on the top level, the projections
on the second level do provide a good opportunity for discovering two clusters within sub-cluster
#1 (mixed green and yellow) on the top level. This also demonstrates the advantage of the hierarchical
visual exploration scheme for cluster discovery. MDL recommends a four-cluster structure on the
top level of Figure 5(b), and the well partitioned four clusters on the bottom level show the ability of MDL
to find the true cluster structure.

In studying leukemia, MIT published a 7129-dimensional microarray data set that contains two
classes, acute lymphoblastic leukemia (ALL; N=47) and acute myeloid leukemia (AML; N=25). The
2-D projections of the MIT data are presented in Figure 6. All four projections of the leukemia data
set explore the data structure similarly to the other cases (simulated and NCI data sets).

DCA is meaningful for model selection and cluster partitioning even under unsupervised
conditions, i.e., when no class information is known. We demonstrate this idea by combining PCA and
DCA dynamically at the top level projection. In Figure 7, the top level projection in the top figure
is generated by PCA (unsupervised analysis) without knowing any class information. After model
selection and partitioning (which provide class information), DCA (supervised analysis) can now produce
the re-projection of the data (middle figure). The bottom figure is a partition of the re-projected
data. Even though the PCA projection (top figure) can be directly partitioned into sub-clusters
(bottom figure), the additional step using DCA (middle figure) provides another chance to visualize
the complete data set from different angles in which the cluster structure is emphasized. As in this example,
this data projection scenario is especially useful when the cluster structure is ambiguous to the user in
one projection, but becomes clear after DCA is applied based on the user's model selection and cluster
partitioning. From the clustering results shown on the bottom level of Figure 7, we can conclude that
the clustering is fairly consistent with the biological phenotype information since only a few mixed
groupings occur.


JOURNAL OF SIGNAL PROCESSING SYSTEMS: WANG, ZHANG, LU, LEE, KUNG, CLARKE


V.  Discussion 

The growing volume of high dimensional and multi-modal data demands a data mining
tool that differs from conventional data visualization methods and is capable of dealing with high
dimensional data sets. The hierarchical visualization paradigm involving hierarchical statistical models and
visualization spaces is shown to be able to effectively discover data structure and capture interesting
aspects of the data. Using several complementary visualization subspaces makes this complicated task
feasible. The strategy of the hierarchical data exploration and mining tool used for cluster discovery
is that the top level model and projection should explain the whole structure of the data set, while
lower level models explain the local and internal structure within individual clusters, which may not
be obvious in the high level models. With several complementary mixture models and visualization
projections, each level will be relatively simple while the complete hierarchy maintains overall
flexibility yet still conveys considerable cluster information. In this algorithm, dimensionality reduction and
cluster decomposition are two major components. The dimensionality reduction allows visualization of
high dimensional data and reduces computational demand. The cluster decomposition provides relatively
simple models by partitioning large and complicated mixture models into small local structures, which
offers great ease of interpretation and many benefits of analytical and computational simplification.

The techniques involved are statistical modeling of the high dimensional data with the SFNM distribution,
2-D data projection jointly presented by unsupervised and supervised data mining schemes, and
evaluation of the cluster structure produced by such schemes using microarray experiments with known
phenotypes. Different from conventional PCA, PCA/PPM, DCA and ICA/PPM project the entire
data set and the softly partitioned sub-clusters onto multiple 2-D visual subspaces, so that every
subcluster/subspace is discriminatively explored individually and local cluster structure is effectively
revealed. Furthermore, the PAPEX and incremental expectation-maximization (IEM) procedures
are implemented to estimate the SFNM distribution. With the model-based approach, a model selection
procedure determines the number of sub-clusters within each cluster using the minimum description
length criterion. This allows the algorithm to automatically determine whether a further split of a subspace
should continue in completing the whole hierarchy [16]. User interaction with the algorithm is also an
important issue. The user-friendly graphical interface facilitates data visualization and
allows the user to select initial centers of the data clusters. In our experience, this greatly reduces
both the computational complexity and the likelihood of converging to a local optimum. Although the final
SFNM model can be estimated, the pathways to achieving cluster decomposition may be multiple.
This user-driven nature of the current algorithm is also highly appropriate for the visualization context.
With such features in the data mining algorithm, our tools can explore data structure to an
extent that standard data analyses may not match.

Fig. 1. The 2-D projections of the simulated data resulting from conventional PCA, PCA/PPM, DCA, and ICA/PPM.


Fig. 2. The model selection and the MDL curve of the simulated data at the top level projection.

Fig. 3. The hierarchical visualization trees of the simulated data. The left figure (a) is generated by using PCA, and
the right figure (b) is generated by using DCA.

Fig. 4. The 2-D projections of the NCI data resulting from PCA, PCA/PPM, DCA, and ICA/PPM.

Fig. 5. The hierarchical visualization trees of NCI data. The left figure (a) is generated by using PCA, and the right figure
(b) is generated by using PCA/PPM. MDL curves for the top level projections are also included.

Fig. 6. The 2-D projections of the MIT data resulting from PCA, PCA/PPM, DCA, and ICA/PPM.

Fig. 7. The 2-D projections of the MIT data resulting from PCA, PCA/PPM, DCA, and ICA/PPM.

Antiestrogens include agents such as tamoxifen, toremifene, raloxifene, and fulvestrant. Currently, tamoxifen is
the only drug approved for use in breast cancer chemoprevention, and it remains the treatment of choice
for most women with hormone receptor positive, invasive breast carcinoma. While antiestrogens have been available
since the early 1970s, we still do not fully understand their mechanisms of action and resistance. Essentially, two
forms of antiestrogen resistance occur: de novo resistance and acquired resistance. Absence of estrogen receptor
(ER) expression is the most common de novo resistance mechanism, whereas a complete loss of ER expression is
not common in acquired resistance. Antiestrogen unresponsiveness appears to be the major acquired resistance
phenotype, with a switch to an antiestrogen-stimulated growth being a minor phenotype. Since antiestrogens
compete with estrogens for binding to ER, clinical response to antiestrogens may be affected by exogenous
estrogenic exposures. Such exposures include estrogenic hormone replacement therapies and dietary and
environmental exposures that directly or indirectly increase a tumor’s estrogenic environment. Whether antiestrogen
resistance can be conferred by a switch from predominantly ERα to ERβ expression remains unanswered, but
predicting response to antiestrogen therapy requires only measurement of ERα expression. The role of altered
receptor coactivator or corepressor expression in antiestrogen resistance also is unclear, and understanding
their roles may be confounded by their ubiquitous expression and functional redundancy. We have proposed
a gene network approach to exploring the mechanistic aspects of antiestrogen resistance. Using transcriptome
and proteome analyses, we have begun to identify candidate genes that comprise one component of a larger,
putative gene network. These candidate genes include NFκB, interferon regulatory factor-1, nucleophosmin, and
the X-box binding protein-1. The network also may involve signaling through ras and MAPK, implicating
crosstalk with growth factors and cytokines. Ultimately, signaling affects the expression/function of the
proliferation and/or apoptotic machineries.

Oncogene (2003) 22, 7316-7339. doi:10.1038/sj.onc.1206937

Introduction

Antiestrogens  primarily  act  by  competing  with  estrogens 
for  binding  to  the  estrogen  receptor  (ER)  and  are  the 
most  widely  administered  endocrine  agents  for  the 
management  of  ER-expressing  breast  cancers.  The  first 
antiestrogens  were  generated  in  the  mid-1950s  as  fertility 
agents  and  included  ethamoxytriphetol  (MER-25)  and 
clomiphene.  The  ability  of  these  compounds  to  induce 
responses  in  some  breast  cancer  patients  soon  became 
apparent  (Kistner  and  Smith,  1960),  but  the  compounds 
induced significant toxicity (Herbst et al., 1964). In the
early  1970s,  the  first  study  in  breast  cancer  patients  was 
published  with  a  new  antiestrogen  tamoxifen  (TAM,  ICI 
46474) (Cole et al., 1971). Over the next 17 years, the
total  exposure  to  TAM  reached  1.5  million  patient  years 
(Litherland  and  Jackson,  1988)  and  other  selective 
estrogen  receptor  modulators  (SERMs)  are  being 
developed  and  studied.  TAM  is  now  the  most  frequently 
prescribed  antiestrogen,  and  compelling  data  have 
demonstrated  a  significant  overall  survival  benefit  with 
the  administration  of  this  agent  in  breast  cancer  patients 
with endocrine responsive disease (EBCTCG, 1992, 1998).

When compared with cytotoxic chemotherapy, antiestrogens are well tolerated and are associated with
mostly minor toxicities (Love, 1989). Common side effects associated with TAM therapy include vasomotor
symptoms, gastrointestinal disturbance, atrophic vaginitis, and changes in sexual functioning (Day et al.,
1999). While the frequency and severity of hot flashes and other toxicities can be particularly unpleasant for
some women, remarkably few discontinue TAM because of these side effects. Medical indications for the
prompt discontinuation of therapy include associated venous thromboembolic disease and endometrial cancer
(typically invasive adenocarcinoma, although uterine sarcomas have been reported). The incidence of these
events is very low, and screening methods for both deep vein thrombosis and endometrial abnormalities exist.
However, these increased risks must be considered in the light of the potential benefits, particularly in the case of
healthy women considering TAM in the setting of chemoprevention as opposed to active treatment. The
development of both venous thromboembolic disease and endometrial cancer is attributed to the estrogenic
effects of TAM and may be abrogated by the development of more SERMs (e.g., raloxifene) or of pure ER
antagonists (e.g., ICI 182,780; fulvestrant) (Robertson, 2001).

Some antiestrogens produce beneficial effects beyond their ability to inhibit existing breast cancers. The most
convincing evidence supports an association between TAM treatment and a marked reduction in the risk of
developing a contralateral breast cancer (EBCTCG, 1992) and a significant reduction in the incidence and
severity of osteoporosis in postmenopausal women (Freedman et al., 2001; Kinsinger et al., 2002). Several
early studies suggested a reduction in the risk of cardiovascular disease with TAM therapy, but this is
not consistently reported (EBCTCG, 1998; Fisher et al., 1998). When observed, the cardiovascular benefit was
usually attributed to the estrogenic effects of TAM; both estrogens and TAM produce apparently beneficial
changes in serum triglyceride and cholesterol concentrations (Joensuu et al., 2000), perhaps through effects
mediated by apolipoprotein E (Liberopoulos et al., 2002). However, these findings must be considered in the
light of recent large studies of estrogenic hormone replacement therapy (HRT) that either failed to identify
an HRT-induced reduction in coronary heart disease (Hulley et al., 1998; Grady et al., 2002; WHI, 2002) and
stroke (Viscoli et al., 2001; WHI, 2002), or demonstrated an increase in the risk of these diseases.

An  overview  of  antiestrogen  resistance 

Despite the relative safety and significant antineoplastic and chemopreventive activities of antiestrogens, most
initially responsive breast tumors acquire resistance (Clarke et al., 2001b). It is unlikely that any single
mechanism or single gene confers antiestrogen resistance. Rather, several mechanisms likely exist that
encompass pharmacologic, immunological, and molecular events. These mechanisms, none of which are fully
understood, likely vary within tumors. Intratumor variability in antiestrogen responsiveness will reflect
the presence of multiple cell subpopulations (Clarke et al., 1990a). Since breast cancers appear highly plastic
and adaptable to selective pressures, the intratumor diversity in antiestrogen responsive subpopulations also
likely changes over time. Tumors appear capable of dynamically remodeling their cell populations in
response to changes in host immunity or endocrinology, or the administration of local or systemic therapies. This
plasticity is probably both cellular (some existing populations die out/back while other populations
become dominant) and molecular (new cell populations emerge as individual cells/populations adapt their
phenotypes by modifying their transcriptomes/proteomes).

Since the major pharmacologic and immunologic mechanisms of antiestrogen resistance have been
previously reviewed (Clarke et al., 2001b), we will focus on the role of molecular signaling through ER-mediated
activities in antiestrogen responsiveness. Antiestrogen resistance can be either de novo or acquired. The most
common and best defined mechanism of de novo resistance is the absence of both ER and progesterone
receptor (PR) expressions. However, we fail to predict response to antiestrogens in approximately 25% of
ER+/PR+, 66% of ER+/PR-, and 55% of ER-/PR+ breast tumors (Honig, 1996). Many ER+ and/or
PR+ breast tumors are already resistant by the time of diagnosis and the resistance mechanism in these tumors
is unknown.

Overall, a loss of antiestrogen responsiveness by initially responsive tumors is likely to be the most
common acquired resistance phenotype. Most initially antiestrogen responsive tumors retain levels of ER
expression at recurrence on antiestrogen therapy that would still define them as being ER+ (Encarnacion
et al., 1993; Kuukasjarvi et al., 1996; Bachleitner-Hofmann et al., 2002). Most data are for TAM
treatment; ICI 182,780, which causes degradation of ER (Dauvois et al., 1992), may have a greater potential
for producing ER- tumors (Kuukasjarvi et al., 1996). From our in vitro studies, loss of ER is not required to
achieve resistance to either ICI 182,780 or TAM (Brunner et al., 1993b, 1997). The loss of ER expression
upon recurrence despite adjuvant TAM therapy has been reported in less than 25% of tumors (Kuukasjarvi
et al., 1996; Bachleitner-Hofmann et al., 2002). Overall, a loss of ER expression does not seem to be the major
mechanism driving acquired antiestrogen resistance.

A  different  resistance  phenotype  has  been  described  in 
human  breast  cancer  xenografts  that  exhibit  a  switch  to 
a  TAM-stimulated  phenotype.  This  mechanism  of 
clinical  but  not  pharmacologic  resistance  may  not  be 
the  dominant  antiestrogen  resistance  phenotype.  If  the 
prevalence of acquired resistance phenotypes in ER+
tumors  broadly  reflects  what  is  seen  in  de  novo 
resistance,  then  the  dominant  resistance  phenotype  is  a 
loss  of  antiestrogen  responsiveness. 

Whether the continued expression of ER is required for antiestrogen-resistant tumor growth or survival is
not known. However, responses to aromatase inhibitors after an initial response and then failure on TAM are
common (Buzdar and Howell, 2001) and strongly suggest that some TAM-resistant tumors retain a degree
of estrogen responsiveness. Where durations of responses to second-line endocrine manipulations are
short, truly estrogen-independent cell populations are either already present at the time of recurrence and/or
many cells in the tumor are able to adapt rapidly to further changes in their endocrine environment. Very
short response durations or disease stabilization may reflect the withdrawal of a mitogenic stimulus that is not
required for the survival or basal proliferation of most cells in the tumor.


Antiestrogens

TAM is a triphenylethylene and its triaryl structure has been widely copied in the design of new compounds.
Several TAM derivatives are already available, including toremifene (chloro-tamoxifen) and droloxifene
(3-hydroxytamoxifen). Not surprisingly, both drugs are essentially equivalent to TAM in terms of their
antitumor activities and toxicities (Roos et al., 1983; Pyrhonen et al., 1999), so neither is widely used in
clinical practice.

The characteristic of raloxifene that has attracted the most interest is its apparent lack of estrogenic effects in
the uterus, resulting in great interest in this drug’s potential role in breast cancer chemoprevention.
Subgroup analysis of the data from the Multiple Outcomes of Raloxifene (MORE) trial revealed that
administration of raloxifene was associated with a 75% reduction in the incidence of invasive breast cancer without a
concurrent increase in the incidence of endometrial cancers (Cummings et al., 1999). This finding has led to
the ongoing randomized study of TAM and raloxifene (STAR) in breast cancer prevention. Raloxifene still acts
as an antiestrogen in the brain, increasing the incidence of hot flashes (Davies et al., 1999). A high incidence of
severe hot flashes is problematic for a drug to be administered for approximately 5 years to otherwise
apparently healthy women. Raloxifene was recently approved by the Food and Drug Administration for the
treatment and prevention of osteoporosis in postmenopausal women. While a benzothiophene, raloxifene
(keoxifene; LY 156,758) has a three-dimensional structure broadly similar to the triphenylethylenes.

ICI 182,780 (Faslodex; Fulvestrant) is among the more promising new antiestrogens. Unlike TAM, ICI
182,780 is a steroidal ER inhibitor that is often described as a ‘pure’ antagonist with no estrogenic
activity. This is in comparison to the triphenylethylene and benzothiophene antiestrogens, which are
nonsteroidal, competitive ER inhibitors with partial agonist activity. The pure antagonist is characterized by
antineoplastic activity in breast cancer and is devoid of uterotropic effects. However, the lack of agonist activity
limits beneficial effects in bone. Whether ICI 182,780 also will increase hot flashes depends on whether it
reaches adequate concentrations in the brain. Unlike TAM (Clarke et al., 1992), ICI 182,780 appears to be a
substrate for the P-glycoprotein efflux pump (De Vincenzo et al., 1996), a major contributor to the
blood-brain barrier (Cordon-Cardo et al., 1989). Consistent with this observation, initial studies suggest that
this antiestrogen does not enter the brain in high concentrations (Howell et al., 1996). Pure antagonists
may further exacerbate bone loss, a concern that also applies to aromatase inhibitors (Dowsett, 1997), but this
issue may be addressed with the concurrent use of bisphosphonates or other therapies for osteoporosis.
Clinical experience with ICI 182,780 has been reviewed by Howell (2001).


Antiestrogens  and  breast  cancer  treatment 

Antiestrogens are effective in the adjuvant, metastatic, and chemopreventive settings and clearly induce
significant increases in overall survival in some breast cancer patients (EBCTCG, 1992, 1998). Unlike
aromatase inhibitors (inhibit estradiol biosynthesis), which are administered as single agents only to women with
nonfunctioning ovaries, TAM can be given irrespective of menopausal status. In the adjuvant setting, TAM is
administered at a daily oral dose of 20 mg, and several studies have now shown that the optimal duration of
treatment is 5 years. While shorter (2 years) and longer (10 years) treatment durations produce notable
responses, the risk:benefit ratios are strongly in favor of 5 years of treatment (Stewart et al., 1996; EBCTCG,
1998).

While molecular predictors of tumor responsiveness are rare for most breast cancer treatments, expressions
of ER and PR strongly predict for a response to antiestrogens. Up to 75% of breast tumors expressing
both receptors (ER+/PR+) respond to TAM. Response rates are somewhat lower in ER+/PR- tumors
(~34%) and ER-/PR+ tumors (45%). The response rate in ER-/PR+ may be an overestimate; relatively
few tumors with this phenotype have been evaluated and the ER- assessment may include false-negative ER
measurements. Only a small proportion of ER-/PR- tumors respond to antiestrogens (<10%), perhaps also
reflecting false-negative ER measurements. Indeed, the most recent meta-analysis from the Early Breast Cancer
Trialists Collaborative Group (EBCTCG) found no significant reduction in recurrence rates in patients with
ER-poor tumors who received adjuvant TAM (EBCTCG, 1998).

The 1998 EBCTCG meta-analysis found
limited evidence for a TAM-induced increase in the risk
of death from any cause in women with ER-poor
tumors. Why TAM might be detrimental to some
women is unclear. However, ER- tumors are known
to exhibit a more aggressive phenotype associated with
lower rates of overall survival (Aamdal et al., 1984) and
would be expected to recur earlier and more frequently.
Estrogenic effects of TAM in these women also could
have increased the number of deaths from cardiovascular
disease and stroke, reflecting the data noted above
from recent studies of estrogenic HRT use (Viscoli et al.,
2001; WHI, 2002).

Antiestrogens  and  breast  cancer  chemoprevention 

TAM's ability to inhibit contralateral breast cancers and its
relatively low incidence of serious side effects led to
studies into its potential use as a chemopreventive agent
for patients with a high breast cancer risk. Three large,
randomized, chemoprevention studies with TAM have
been performed to date: the NSABP P-1 trial (n = 13 388
participants) (Fisher et al., 1998), the Royal Marsden
Trial (n = 2471 participants) (Powles et al., 1998), and
the Italian Chemoprevention Trial (n = 5408 participants)
(Veronesi et al., 1998). Outcomes have been
mixed: no significant reduction in risk was seen in the
initial reports of either the UK or Italian trials, whereas
the P-1 trial reported significant reductions in the
incidence of both noninvasive (50%) and invasive
(49%) breast cancers. A recent update on the Italian
Trial reports an 82% TAM-induced reduction in the
breast cancer risk among women at high risk for ER+
breast cancer (Veronesi et al., 2003). In the NSABP trial,
reductions in breast tumor incidence were seen only
for ER+ tumors (Fisher et al., 1998).
Reasons for the disparities among the trials have been
widely discussed; these tend to focus on differences in
patient populations, subject eligibility criteria, and study
size. Results from the NSABP P-1 trial, which are
broadly consistent with the 39% reduction in contralateral
breast cancer incidence reported for TAM use
(EBCTCG, 1992), are usually considered the more
definitive. These data contributed to the decision by
the Food and Drug Administration (USA) in October
1998 to allow the use of TAM as a chemopreventive
agent for breast cancer. More recently, NSABP has
reported TAM-induced reductions in the risks of
adenosis, fibrocystic disease, hyperplasia, metaplasia,
fibroadenoma, and fibrosis in the P-1 trial (Tan-Chiu
et al., 2003).


Estrogens  and  breast  cancer 

Since antiestrogen action and resistance are intimately
affected by estrogen exposure, we briefly address the
role of estrogens in breast cancer. An association
between parity and breast cancer risk was observed by
the 17th century Italian physician Bernardino Ramazzini
(1633-1714) in his 'De Morbis Artificum', published in
1700. The ability of ovariectomy to induce remissions in
premenopausal breast cancer patients was shown by the
Scottish physician George Beatson, the first clear
evidence of an effective endocrine therapy for this
disease (Beatson, 1896). More recent epidemiologic
data show clear associations of early age at menarche,
late age at menopause (Nishizuka, 1992), pregnancy
(Hsieh et al., 1994), obesity (Hulka and Stark, 1995),
serum estrogen concentrations (EHBCCG, 2002), and
use of estrogenic HRTs (Magnusson et al., 1999;
Schairer et al., 1999, 2000) or oral contraceptives
(Berger et al., 2000) with an increase in the risk of
developing breast cancer. Risk appears related to the
timing of exposure and whether the cancer develops
during the premenopause or postmenopause
(Hilakivi-Clarke et al., 2002).

Precisely how estrogens affect breast cancer risk
remains controversial and outcome may be dependent
upon the timing and duration of exposure. During the
postmenopausal years, estrogenic stimuli are more
closely associated with an increased breast cancer risk.
However, we have recently reviewed evidence consistent
with the hypothesis that, depending on the timing of
exposure, increased estrogenic exposure can be associated
with a reduced risk of breast cancer
(Hilakivi-Clarke et al., 2002). For example, estrogenic stimuli
during childhood or the premenopausal years may affect
breast development such that the breast is less susceptible
to transformation. Estrogens may reduce breast
cancer incidence in some women by altering mammary
gland development and inducing the expression of genes
involved in DNA repair (Hilakivi-Clarke et al., 1999a;
Hilakivi-Clarke, 2000).

For the purposes of this review, we will focus on the
aspects of estrogen exposure that are associated with
increased breast cancer risk and the survival/proliferation
of established neoplastic breast cells. Hence,
estrogens can be considered to act either as promoters
(factors that stimulate the growth and/or survival of
existing transformed cells) or as initiators (factors that
induce the genetic damage that leads to cellular
transformation). Evidence that estrogens are tumor
promoters is well established from both experimental
and clinical observations. For example, the growth of
several human breast cancer cell lines in vitro and in vivo
is stimulated by estrogenic supplementation. Indeed,
such estrogenic supplementation is effective whether
administered as classical estrogens (e.g., estradiol,
estrone, or estriol) or plant-derived phytoestrogens such
as the isoflavone genistein (Hsieh et al., 1998). In
addition, antiestrogens, aromatase inhibitors, luteinizing
hormone releasing hormone agonists/antagonists, and
ovariectomy are effective in the treatment of some
breast cancer patients, all of which limit the interaction
between a promotional (estrogenic) stimulus and cancer
cells.

As tumor promoters, the effects of estrogens are
related to the duration and timing of exposure. Withdrawal
of an estrogenic stimulus that acts as a promoter
could produce an eventual reduction in risk because it
no longer promotes the growth or survival of existing
cancer cells. Pregnancy produces a natural and significant
increase in circulating estrogens, but only a
transitory increase in breast cancer risk in young
women. Indeed, if the first pregnancy was at a young
age, the short-term increase may eventually translate
into a lifetime reduction in breast cancer risk (Hsieh
et al., 1994). The increased breast cancer risk associated
with either oral contraceptive or estrogenic HRT use is
also related to the recency of use. Risk is highest in
current users and begins to decline upon cessation of use
(CGHFBC, 1996; Schairer et al., 2000).

Evidence that estrogens act as chemical initiators is
more controversial. Estrogens can exhibit carcinogenic
activity in some animal models; perhaps the best-known
example is the ability of estrogens to induce renal
cancers in Syrian hamsters (Kirkman, 1972). However,
compelling evidence that estrogens initiate mammary
cancer in animals is hard to find. In the 1930s,
Lacassagne (1932) performed several studies in male
mice and showed that administration of large doses of
estrone can induce mammary tumors. While consistent
with an estrogen-mediated initiation of mammary
cancer, it is possible that the mice were infected with
the mouse mammary tumor virus (MMTV). Other than
some transgenic/null mouse models, only in the ACI rat
does estrogen administration reproducibly produce a
high incidence of mammary tumors (Cavalieri and
Rogan, 2002).

Reactive estrogen semiquinone/quinone intermediates,
produced by the redox cycling of estrogen
metabolites hydroxylated at the C3 and C4 positions
of the aromatic A-ring, are the most likely estrogen
initiators (Cavalieri et al., 1997; Bishop and Tipping,
1998; Cavalieri and Rogan, 2002). These reactive species
can generate a substantial intracellular oxidative stress
and directly damage DNA through the production of
DNA adducts. Such events could define reactive
estrogen metabolites as initiators, rather than as merely
promoters of carcinogenesis. Recently, the National
Toxicology Program (2003) listed, for the first time,
steroidal estrogens as carcinogens.

Estrogen  independence  and  antiestrogen  resistance 

Estrogen independence and antiestrogen resistance are
often considered to be synonymous, which is not
surprising since ER- tumors are definitively
estrogen-independent and very rarely respond to antiestrogens,
ovariectomy, or aromatase inhibitors. Nonetheless,
several observations suggest that various forms of both
estrogen independence and antiestrogen resistance exist
and that these may be biologically and clinically very
different. For example, second-line responses to aromatase
inhibitors after response and recurrence on TAM
are common (Goss et al., 1995; Buzdar et al., 1996).
Crossover between more similar compounds, such as
other nonsteroidal antiestrogens, rarely produces secondary
responses (Johnston, 2001), although crossover
to structurally different antiestrogens can produce
secondary responses in patients. Tumors that respond
first to TAM (triphenylethylene) show a marked
response to ICI 182,780 (steroidal) administered upon
failure of the TAM therapy (Howell et al., 1995). Similar
patterns of responses were seen previously in experimental
models (Brunner et al., 1993b). For example,
MCF-7 human breast cancer cells were selected for the
ability to grow in the absence of estrogens (Clarke et al.,
1989a). The selected cells are estrogen-independent
because they no longer require estrogens for growth
either in cell culture or as xenografts in athymic nude
mice. However, when exposed to either 4-hydroxytamoxifen
or ICI 182,780, the cells are growth inhibited
both in vitro and in vivo (Clarke et al., 1989a; Brunner
et al., 1993a, b).

These observations strongly imply that the ability of
breast cancer cells to grow in a low or nonestrogenic
environment is not always synonymous with antiestrogen
resistance. Four antiestrogen resistance phenotypes
have been defined (Clarke and Brunner, 1995) and are
shown in Table 1. The clinical applicability of these
phenotypes remains to be determined but they are useful
for defining resistance phenotypes in experimental
models.
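
The four phenotypes in Table 1 can be read as a simple decision rule over three observations: whether a model responds to nonsteroidal antiestrogens, to the steroidal antiestrogen ICI 182,780, and to aromatase inhibitors. The sketch below is only an illustration of that reading. The function name, the boolean inputs, and the handling of combinations that Table 1 does not list are assumptions, not part of the published definitions.

```python
from typing import Optional


def classify_resistance(responds_nonsteroidal: bool,
                        responds_steroidal: bool,
                        responds_aromatase_inhibitor: bool) -> Optional[int]:
    """Map three response observations onto the Table 1 phenotype (1-4).

    Returns None for combinations Table 1 does not describe
    (an assumption of this sketch, not a published category).
    """
    antiestrogen_responses = (responds_nonsteroidal, responds_steroidal)

    if responds_aromatase_inhibitor:
        if all(antiestrogen_responses):
            return 1  # fully responsive
        if any(antiestrogen_responses):
            return 2  # resistant to one antiestrogen class only
        return 3      # resistant to all antiestrogens, may respond to AIs
    if not any(antiestrogen_responses):
        return 4      # multihormone-resistant
    return None       # not covered by Table 1


if __name__ == "__main__":
    # A TAM-resistant model that still responds to ICI 182,780 and to
    # aromatase inhibitors falls under Type 2.
    print(classify_resistance(False, True, True))
```

In this reading, "resistant" covers both unresponsive and antiestrogen-stimulated growth, as in the Table 1 footnote.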


Intratumor  estrogens  and  antiestrogens  and  exogenous 
estrogenic  exposures 

Antiestrogens act within cells, primarily to compete with
available estrogens for binding to ER. Thus, the
antiestrogenic potency of any compound is related to
its affinity for ER relative to that of any estrogens
present and the concentrations of both the antiestrogens
and estrogens. The data in Table 2 show the relative
affinities of the primary estrogens, antiestrogens and
their major metabolites, and selected environmental
estrogens and phytoestrogens. Intratumor estrogen
concentrations are affected by several factors including
serum estrogen concentrations and local estrogen
production within the breast. Serum estrogen concentrations
are affected by the presence or absence of
functional ovaries and exogenous estrogen use such as
HRT, some oral contraceptives, and various dietary
components.
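
The dependence of potency on both relative affinity and concentration can be illustrated with the standard equilibrium expression for several ligands competing for one binding site. The sketch below is a simplification under stated assumptions: a single noninteracting site, known free concentrations, and dissociation constants derived from relative binding affinities (RBA) by treating RBA as inversely proportional to Kd. The estradiol Kd of 0.1 nM and the free concentrations in the usage example are hypothetical values chosen for illustration, not measurements from this review.

```python
def kd_from_rba(rba: float, kd_estradiol_nm: float) -> float:
    """Approximate a ligand's Kd (nM) from its RBA (estradiol = 100).

    Assumes RBA is inversely proportional to Kd, which is only an
    approximation of how competition assays report affinity.
    """
    return kd_estradiol_nm * 100.0 / rba


def fractional_occupancy(ligands: dict) -> dict:
    """Fraction of receptor bound by each ligand at equilibrium.

    ligands maps name -> (free concentration in nM, Kd in nM).
    Single-site competitive binding:
        occupancy_i = (C_i / Kd_i) / (1 + sum_j C_j / Kd_j)
    """
    terms = {name: conc / kd for name, (conc, kd) in ligands.items()}
    denominator = 1.0 + sum(terms.values())
    return {name: term / denominator for name, term in terms.items()}


if __name__ == "__main__":
    kd_e2 = 0.1  # nM, hypothetical illustrative value
    occupancy = fractional_occupancy({
        "estradiol": (1.0, kd_e2),                       # hypothetical free conc.
        "tamoxifen": (100.0, kd_from_rba(7.0, kd_e2)),   # RBA of 7 from Table 2
    })
    for name, fraction in occupancy.items():
        print(f"{name}: {fraction:.3f}")
```

Under these illustrative inputs a low-affinity antiestrogen still occupies most receptors because its concentration advantage outweighs its affinity disadvantage, which is the qualitative point made in the text.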

Passive diffusion across the plasma membrane
appears to be the primary method by which both TAM
and estradiol enter cells. However, both TAM and
estrogens are extensively bound to serum proteins and
probably also to cellular proteins in tumor/nontumor
cells within the breast (Clarke et al., 2001b). Release
from serum proteins likely occurs within the tumor
vasculature, with both estrogens and antiestrogens being
subsequently sequestered within tumor/nontumor cells
by intracellular proteins. The lipophilicity of both
hormone and drug, and the significant amount of
adipose tissue in the breast, may produce a local
reservoir for both estrogens and antiestrogens. However,
the concentration of free drug/hormone within
cells and serum may be relatively low. Intracellular
sequestration of drug/hormone in tumor and stromal
cells could produce a concentration gradient favoring
diffusion into local tissues. If the affinity and capacity of
tissue for drug/hormone exceed that of blood, significant
accumulation within tumors would likely occur. Data in
Table 3 (adapted from Clarke et al., 2001b) illustrate
several points regarding the pharmacokinetics of estrogens
and antiestrogens. For example, intratumor concentrations
of both estradiol and TAM are much higher
than their respective concentrations in the serum. For
estrogens, where the primary estrogen present in tumors
is 17β-estradiol, both biosynthesis within the tumor and
significant uptake from blood occur.


Table 1  Antiestrogen resistance phenotypes

Antiestrogen
resistance    Phenotype

Type 1        Fully responsive to antiestrogens and aromatase inhibitors
Type 2        Resistant(a) to nonsteroidal antiestrogens but responsive to
              ICI 182,780 and aromatase inhibitors (or resistant to
              ICI 182,780 but responsive to nonsteroidal antiestrogens
              and aromatase inhibitors)
Type 3        Resistant to all antiestrogens but potentially responsive to
              aromatase inhibitors
Type 4        Multihormone-resistant (resistant to all endocrine therapies
              and includes ER- and PR- tumors)

(a) Resistance can be considered as unresponsiveness and antiestrogen-stimulated phenotypes


Table 2  Relative binding affinities (approximate) of selected estrogens,
antiestrogens, and environmental estrogens and phytoestrogens(a)

Compound                        Relative binding affinity
                                (17β-estradiol = 100)
                                ERα              ERβ

Estrogens
  Estrone                       60               37
  Estriol                       14               21

Antiestrogens
  Tamoxifen                     7                6
  4-Hydroxytamoxifen            178              339
  Nafoxidine                    44               16
  ICI 164,384(b)                85               166
  Raloxifene                    69               16
  Clomiphene                    25               12

Environmental estrogens and phytoestrogens
  Genistein                     5                36
  Resveratrol                   <1.1 x 10^(-?)   <1.6 x 10^(?)
  Zearalenol                    7                5
  o,p'-DDE, 2-(2-chlorophenyl)-2-(4-chlorophenyl)-1,1-dichloroethylene
                                <0.01            <0.01
  Bisphenol A                   0.01             0.01

(a) Adapted from Kuiper et al. (1998), Kuiper et al. (1997) and Bowers
et al. (2000); the methods for estimating ER binding are not the same
across these studies but all three express binding relative to the values
estimated for 17β-estradiol
(b) ICI 182,780 is an analog of ICI 164,384

The ability of estrogens and antiestrogens to compete
for binding to ER is likely to reflect intracellular
availability. While their respective free concentrations
are largely unknown, the data in Tables 2 and 3 imply
that many breast tumors should accumulate a sufficient
excess of TAM and its major antiestrogenic metabolites
to compete readily with intratumor estrogens. If the
estimate for estradiol concentrations (1.29 nM) and the
reported concentrations for TAM and its major
metabolites (~3 µM TAM + ~1 µM N-desmethyltamoxifen
+ ~0.2 µM 4-hydroxytamoxifen) in tumors are good
approximations (Table 3), antiestrogenic metabolites
may accumulate to levels up to 10^3-fold higher than
estradiol. While TAM and N-desmethyltamoxifen have
relative ER binding affinities about 10% that of
estradiol (Table 2), overall, antiestrogenicity may exceed
estrogenicity in most TAM-treated breast tumors by
100-fold (assuming equivalent availability).
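
The arithmetic behind this estimate can be written out explicitly. The sketch below takes the intratumor concentrations as read from the paragraph above (about 3 µM TAM, 1 µM N-desmethyltamoxifen, 0.2 µM 4-hydroxytamoxifen, and 1.29 nM estradiol) and weights each antiestrogen by its binding affinity relative to estradiol. The 10% figure for TAM and N-desmethyltamoxifen comes from the text; the value of 178 for 4-hydroxytamoxifen is the ERα entry in Table 2. Treating total tissue concentrations as equally available for receptor binding is the same simplifying assumption the text makes, and the variable and function names are illustrative only.

```python
# Intratumor concentrations in nM, as read from the text (assumed inputs).
ESTRADIOL_NM = 1.29
ANTIESTROGENS_NM = {
    "tamoxifen": 3000.0,
    "N-desmethyltamoxifen": 1000.0,
    "4-hydroxytamoxifen": 200.0,
}

# Relative binding affinity with estradiol = 100 (assumed inputs).
RBA = {
    "tamoxifen": 10.0,             # "about 10%" per the text
    "N-desmethyltamoxifen": 10.0,  # "about 10%" per the text
    "4-hydroxytamoxifen": 178.0,   # Table 2, ERα column
}


def molar_excess(antiestrogens_nm: dict, estradiol_nm: float) -> float:
    """Total antiestrogen concentration divided by estradiol concentration."""
    return sum(antiestrogens_nm.values()) / estradiol_nm


def affinity_weighted_excess(antiestrogens_nm: dict, rba: dict,
                             estradiol_nm: float) -> float:
    """Antiestrogen excess expressed in estradiol-equivalent binding units."""
    equivalents = sum(conc * rba[name] / 100.0
                      for name, conc in antiestrogens_nm.items())
    return equivalents / estradiol_nm


if __name__ == "__main__":
    print(f"molar excess: {molar_excess(ANTIESTROGENS_NM, ESTRADIOL_NM):.0f}-fold")
    weighted = affinity_weighted_excess(ANTIESTROGENS_NM, RBA, ESTRADIOL_NM)
    print(f"affinity-weighted excess: {weighted:.0f}-fold")
```

With these inputs the molar excess is in the thousands and the affinity-weighted excess is in the hundreds, so the order-of-magnitude argument in the text does not depend on the exact values chosen.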

This interpretation is consistent with the initial
antiestrogenic activity of TAM seen in most ER+
breast cancers. No compelling evidence shows that
TAM becomes extensively metabolized to purely estrogenic
metabolites in patients with antiestrogen-resistant
cancer. Furthermore, little evidence has been produced
to suggest that the balance of TAM metabolism is such
that sufficient concentrations of its estrogenic metabolites
are produced to overcome TAM's intracellular
cumulative antiestrogenicity (combination of parent
drug plus its antiestrogenic metabolites) (Clarke et al.,
2001b). Currently, no clinically relevant ER variants/
mutants have been described that could adequately
affect intratumor pharmacology to an extent sufficient
to offset this balance in favor of a TAM-stimulated or
other antiestrogen-resistant phenotype in a significant
proportion of breast cancers.


Table 3  Serum and intratumor estrogen and tamoxifen concentrations(a)

Serum concentrations

Mean estimates of estrogen concentrations
  Group              Concentration   Comments
  Follicular phase   <0.28 nM        Normal menstrual cycle
  Luteal phase       <1.1 nM         Normal menstrual cycle
  Pregnancy          <150 nM         Normal third trimester (when estrogen
                                     concentrations are highest)
  Breast cancer      0.114 nM        All postmenopausal women; in most studies
  Controls           0.093 nM        these differences are statistically
                                     significant(a)

Estimates of tamoxifen concentrations
  Concentration   Drug/metabolite           Comments
  <1.1 µM         Tamoxifen + metabolites   Similar to normal tamoxifen regimen
  <4.0 µM         Tamoxifen                 High-dose tamoxifen regimen
  <6.0 µM         N-desmethyltamoxifen      High-dose tamoxifen regimen

Intratumor concentrations

Mean estimates of estrogen concentrations
  Tissue             Concentration   Comments
  Breast tumors      1.29 nM
  Non-neoplastic     0.76 nM         Non-neoplastic includes adjacent normal,
                                     fibroadenomas, adipose tissues

Mean estimates of tamoxifen concentrations
(Mean estimates vary across studies. The values represented here are
among the higher of the reported mean values(a))
  Concentration   Drug/metabolite        Comments
  <3.0 µM         Tamoxifen              Breast tumors
  <4.0 µM         Tamoxifen              Brain metastases from breast cancer
  <7.0 µM         N-desmethyltamoxifen   Breast tumors
  <8.0 µM         N-desmethyltamoxifen   Brain metastases from breast cancer
  <0.2 µM         4-Hydroxytamoxifen     Brain metastases from breast cancer

(a) See Clarke et al. (2001) as to how these values were obtained and for citations to the source publications

Changes in TAM influx/efflux could alter its intracellular
concentrations, and limited evidence suggests that
this may occur in some tumors. However, the extent to
which it occurs and the mechanisms driving such
changes are unclear (Clarke et al., 2001b).

Exogenous  estrogenic  exposures  and  their  effects  on 
antiestrogen  resistance 

Since estrogens compete with antiestrogens for ER
binding, any compound with either estrogenic activity or
the ability to increase estrogen exposure could affect
response to antiestrogens. Estrogenic exposures come in
many forms, including plant and environmental estrogens
(Hilakivi-Clarke et al., 1999b; Clarke et al., 2001a),
dietary exposures that affect the levels of endogenous
estrogens (Hilakivi-Clarke et al., 1997), and estrogenic
HRT (Clarke et al., 2001b). Dietary antioxidant
exposure also may affect antiestrogen responsiveness
(Clarke et al., 2001b) and some women already take the
most potent natural antioxidant (vitamin E) as an
alternative medicine for controlling menopausal symptoms
(Stampfer et al., 1993; Barton et al., 1998; Koh
et al., 1999).

The inclusion of women on HRT in some of the
chemoprevention trials has been one of the issues raised
to explain TAM's lack of activity in these trials. It is
unlikely that HRT would raise serum estrogens beyond
levels seen in TAM-responsive premenopausal women.
However, the nature of the estrogenic exposure is very
different between postmenopausal women on HRT and
premenopausal women. More data are required to
assess directly the contribution of HRT to TAM
responsiveness.

Dietary  exposures  and  tamoxifen  activity 

Several dietary components, including those present in
dietary fats, soy, fruits, vegetables, and alcohol, have
been suggested to have either protective or harmful
effects on the breast. Some of these dietary factors, such
as dietary fats and soy, can alter circulating estrogen
levels (Lu et al., 2000) and interact with ER (Wang et al.,
1996b; Collins et al., 1997; Zava and Duwe, 1997).
TAM's ability to affect the growth of ER+ tumor cells
may be altered by dietary intakes of fats and soy. Fats,
soy, and other dietary components also modify other
cell signaling pathways (Agarwal, 2000; Bouker and
Hilakivi-Clarke, 2000; Clarke et al., 2002). If TAM
signals through the same pathways, a dietary factor
might modify TAM's ability to inhibit the growth of
malignant breast cells (ER-dependent or -independent
interactions). Dietary components that alter signaling of
a pathway that affects tumor growth independent of
TAM also could either potentiate or reverse TAM's
effects. Data from both in vitro and in vivo studies
strongly support the hypothesis that at least some
dietary factors modify TAM's actions in the breast.

Soy,  dietary  fat,  vegetables,  and  antiestrogen 
responsiveness 

High soy protein intake has been proposed to contribute
to low breast cancer incidence among Asian women
(Adlercreutz, 1995). A recent meta-analysis shows that a
high intake of soy is associated with a reduced risk of
developing premenopausal, but not postmenopausal,
breast cancer (Trock et al., 2001). Soybeans contain
large amounts of the isoflavones daidzein and genistein
(Barnes et al., 1994; Adlercreutz, 1995). Genistein has
many biological effects that could potentially reduce
breast cancer risk, including inhibition of tyrosine
kinase, EGFR tyrosine phosphorylation, and topoisomerase
II activities. It also arrests cell cycle progression
at G2-M, induces apoptosis, has antioxidant
properties, modifies eicosanoid metabolism, and inhibits
in vitro angiogenesis (see the review by Messina et al.,
1994). While each of these actions of genistein could
influence antiestrogen responsiveness, they occur primarily
at pharmacologic rather than physiologic exposures.
Humans consuming high levels of soy-based
food products have less than 1 µM of circulating
genistein (Messina et al., 1994), whereas 30-185 µM genistein
is required to induce many of the above-mentioned
effects in experimental models in vitro, where bioavailability
is already likely to be greater than in vivo.

At physiological concentrations, genistein exhibits
estrogenic properties that could enhance breast cancer
risk. Genistein activates the ER (Wang et al., 1996b;
Collins et al., 1997; Zava and Duwe, 1997) and induces
proliferation of human breast cancer cells in vitro
(Martin et al., 1978; Wang et al., 1996b). Genistein also
stimulates proliferation of mammary epithelial cells in
rodents (Santell et al., 1997; Hsieh et al., 1998) and in
women (Petrakis et al., 1996; McMichael-Phillips et al.,
1998). Data from ovariectomized athymic mice, representing
a model of postmenopausal breast cancer, show
that genistein and soy protein isolate both promote the
growth of MCF-7 xenografts (Allred et al., 2001).
Furthermore, a recent study in athymic mice showed
that genistein blocked the inhibitory effect of TAM on
the growth of MCF-7 xenografts (Ju et al., 2002). These
results suggest caution in consuming high levels of
genistein among postmenopausal women who are taking
TAM for their breast cancer or to reduce their risk of
developing breast cancer.

Very little is known about possible interactions
between high dietary fat intake and the activity of
TAM. TAM has beneficial effects on some aspects of
fatty acid metabolism, for example, by reducing
cholesterol levels (Reckless et al., 1997). Diets containing
n-3 PUFAs can increase the efficacy of cytotoxic
drugs against ER- human breast cancer xenografts
(MDA-MB-231) (Hardman et al., 2001). A recent study
suggests that n-3 PUFAs restore TAM's ability to
inhibit cell growth (DeGraffenried et al., 2003). Oleic
acid appears to affect indirectly TAM's dissociation
from cellular antiestrogen binding sites (Hwang, 1987),
an effect that could increase the intracellular concentrations
of free drug. Since n-3 PUFAs have many
biological activities, they may play a role in modifying
TAM's actions, including an ability to inhibit protein
kinases (Mirnikjoo et al., 2001). γ-Linolenic acid has
several properties that might make it antitumorigenic.
Kenny et al. (2001) have shown that γ-linolenic acid
reduces the growth of MCF-7 xenografts, reduces ER
levels in these cells, and potentiates TAM's ability to
inhibit cell growth. However, the precise mechanism of
action of γ-linolenic acid remains to be determined.

Cruciferous vegetables, such as broccoli, cabbage,
cauliflower, and Brussels sprouts, contain high levels of
indole-3-carbinol (I3C) and its metabolite 3,3'-diindolylmethane
(DIM). These compounds have been shown to
exhibit chemopreventive activity in multiple target
organs including the breast (Bradlow et al., 1999).
Several mechanisms of action have been proposed for
I3C and DIM, including changes in phase I and II
enzyme activities and in cell cycle progression. Data
from Katchamart and Williams (2001) show that I3C
and DIM downregulate the expression of the cytochrome
P-450 components that convert TAM to its
more potent metabolites. Thus, these authors propose
that high intake of cruciferous vegetables might reduce
TAM efficacy. Vitamin A/retinoids can interact with
estrogens, and some studies suggest that retinoids can
increase the activity of TAM (McCormick and Moon,
1986; Anzano et al., 1994). Little evidence from human
studies exists to support directly this interaction.
However, remarkably few studies have been undertaken
in this area and additional data are clearly needed.


Estrogen  receptors  and  antiestrogen  resistance 

Two ER genes have been identified: the classical ERα on
human chromosome 6q25.1 and ERβ on chromosome
14q22-25. Each receptor acts as a nuclear transcription
factor that binds responsive elements (estrogen responsive
elements; EREs) within the promoters of target
genes (Figure 1a) or binds to other proteins and affects
their abilities to regulate transcription (e.g., AP-1, SP-1;
Figure 1b). ERα and ERβ homology is limited in the
transcriptional regulatory domains, particularly in the
N-terminal region. Both ER homodimers and heterodimers
are formed and these may differ in their ability to
affect transcription at some promoters (Tyulmenkov
et al., 2000). For example, the ER binds directly to
EREs, which are broadly defined consensus sequences
with some tolerance to variation in their sequence. ER
also binds to, and regulates the transcriptional activity
of, other transcription factors including AP-1 and SP-1,
and acts at cyclic AMP response elements (CRE) (Paech
et al., 1997; Castro-Rivera et al., 2001; Liu et al., 2002b).



Figure 1 Estrogen receptor (ER) function: a simplistic representation.
ERs function as nuclear transcription factors, bound to
either estrogen responsive elements (a) or to proteins bound to
other responsive elements, for example, AP-1, SP-1 (b). Transcription
can be induced or repressed, with the pattern of genes affected
likely reflecting the mix of coregulators available to bind to the
various ER-transcription complexes formed on respective promoters.
Evidence for both ligand-dependent and -independent
activation exists, and it is clear that different ligands can induce
different conformations in the bound ER proteins. ER = estrogen
receptor; in (a) the hatched ellipse represents a coregulator; in (b)
the split ellipse represents a protein complex such as AP-1 or SP-1


The patterns of ER expression vary in the mammary
gland. In most normal mammary epithelia, the two
receptors are rarely expressed in either a high proportion
of cells or at very high levels. The ERα:ERβ ratio may
change during carcinogenesis, such that the ERα
proportion increases as the cells acquire a more
progressed phenotype. Whether this change reflects an
increase in ERα or a decrease in ERβ expression
(Leygue et al., 1998), and whether it is a function or a
consequence of malignant transformation or progression
is unclear. ERα appears to be the more highly
expressed of the two receptors in breast tumors (Leygue
et al., 1998; Speirs et al., 1999a), at least when both are
coexpressed in the same cells (Saunders et al., 2002).
However, some of the few existing studies that measured
both ERα and ERβ proteins have been complicated by
the use of different antibodies of occasionally uncertain
quality (Speirs, 2002).

When occupied by estradiol, ERα and ERβ can
produce similar effects on gene regulation in simple
ERE-driven reporter construct studies (Kuiper et al.,
1996). However, the ligand binding profiles of the two
receptors may be species specific (Harris et al., 2002).
Furthermore, at other promoters, the two receptors
have very different activities. For example, ERα and
ERβ have opposite effects on transcription driven by
AP-1, SP-1, or CRE sites in promoter-reporter assays
(Paech et al., 1997; Castro-Rivera et al., 2001;
Maruyama et al., 2001a; Liu et al., 2002b). Differential
regulation of cyclin D1 by ERα and ERβ has been
reported (Liu et al., 2002b), and ERβ can block the
transcriptional activation of AP-1 by ERα (Maruyama
et al., 2001b). Changes in ER expression/activation
might be important in affecting endocrine responsiveness
if genes driven primarily by AP-1, SP-1, and/or
CRE elements are rate limiting in affecting signaling to
apoptosis/proliferation/survival.

The relative importance of ERα and ERβ in affecting
antiestrogen responsiveness remains to be established.
However, the extensive existing data with well-characterized
ERα antibodies that do not recognize ERβ
allow for some speculation. Ligand binding ER assays
(which do not differentiate between ERα and ERβ) and
immunohistochemical detection of ER in patients’
tumors (which detects ERα only) broadly agree in their
determination of ER-positivity and prediction of TAM
sensitivity (Alberts et al., 1996; Molino et al., 1997).
Thus, whatever the role of ERβ, measuring ERα is
sufficient to predict whether or not a patient is likely to
benefit from treatment with antiestrogen, aromatase
inhibitor, or ovariectomy. These findings also would be
consistent with a requirement of ERα for antiestrogen
sensitivity, which is further consistent with data from
most experimental models in which ERα is usually the
dominant ER isoform expressed.

Since loss of ERα (i.e., the tumor phenotype changes
from ERα+ to ERα−) is relatively uncommon as an
acquired antiestrogen resistance mechanism, it seems
unlikely that many resistant tumors acquire a true
ERα−/ERβ+ phenotype. If there is a role for ERβ, it
may be driven by changes in its expression level relative
to ERα, since heterodimers are functionally important
(Pettersson et al., 1997; Tyulmenkov et al., 2000). When
introduced into ER− MDA-MB-231 breast cancer cells,
ERβ produces ligand-independent inhibition of proliferation,
whereas ERα-mediated effects are ligand-dependent
(Lazennec et al., 2001). A ligand-independent
suppression of growth by ERβ might confer a
multihormone-resistant phenotype (Schinkel et al., 1991)
(multihormone resistance is Type 4 resistance as shown
in Table 1), since ICI 164,384 could not block the
ligand-independent effect of ER expression in
MDA-MB-231 cells (Lazennec et al., 2001).

Currently, determining the relative importance of
ERβ expression in antiestrogen responsiveness is limited
by the lack of adequate data regarding ERβ protein
expression in responsive and resistant breast tumors.
The possible association of ERβ mRNA expression with
a poor prognosis (Dotzlaw et al., 1999; Speirs et al.,
1999b) may further complicate matters. Only one small
study (n = 9 TAM resistant; n = 8 TAM responsive
tumors) has explored the association of ERβ expression
with antiestrogen resistance. The authors reported
increased ERβ mRNA expression in antiestrogen-resistant
tumors (Speirs et al., 1999a). Nonetheless, the
outcome is potentially confounded by the very small
number of cases, the fact that only ERβ mRNA was
measured, and the possible association of ERβ expression
with a more aggressive phenotype (Dotzlaw et al.,
1999; Speirs et al., 1999b).

Several mutant and splice variant forms of both ERα
and ERβ have been reported and previously reviewed
(Hopp and Fuqua, 1998; Murphy et al., 1998).
Evidence that any of these are functionally
relevant in driving a significant proportion of breast
cancers remains largely unconvincing. For example,
most data only measure mutant mRNAs that may not
be translated into biologically relevant protein concentrations
in cells. Most tumors that express mutant ER
concurrently express the wild-type receptor, with the
mutant representing a relatively small proportion of
total ER. A mutant ERα (D351Y) that perceives TAM
as an agonist has been described in some TAM-stimulated
MCF-7 cell variants (Jiang et al., 1992).
Similarly, changes in the F-region of the receptor also
can affect the activities of estradiol and 4-hydroxytamoxifen
(Schwartz et al., 2002). The agonist activities of
raloxifene are also increased in D351Y (Liu et al.,
2002a). Expression of this mutant in breast tumors in
patients has not been reported. Thus, the clinical
relevance of this ER mutant or functionally similar
ER mutant proteins remains unclear. However, our
understanding of the role of ER mutants and variants
may change in the near future (Fuqua, 2001). Currently,
little compelling evidence exists in support of mutant or
splice variant ERα and/or ERβ contributions to either
de novo or acquired antiestrogen resistance or hormone
independence (Karnik et al., 1994; LeClercq, 2002).
However, the importance of receptor mutations and
variants in other diseases suggests that a role for these
modifications of ERs may yet be shown.

Coregulators  of  estrogen  receptor  function  and 
antiestrogen  resistance 

Whatever the ERE and/or other transcription factor
bound, the ability to affect transcription of a target gene
is further modified by multiple components of the
transcription complex. Perhaps the most widely studied
modifiers of ER-mediated transcription are the coregulators.
Coregulators can be either coactivators (inducers)
or corepressors (inhibitors) of gene transcription.
These molecules often act by altering histone acetylation
(Kim et al., 2001). While most studies of coregulator
action have been carried out with ERα, ERβ function is
also affected (Tremblay et al., 1998), as is the activity of
other members of the steroid hormone receptor
superfamily.

ER coregulators in several protein families have been
described in recent years, almost all of which are
ubiquitously expressed (Graham et al., 2000) and
defined initially by their ability to affect ER-mediated
transcription in simple promoter-reporter transcription
assays. Considerable redundancy is evident, with many
coactivators or corepressors exhibiting similar transcription
regulatory effects in comparable/identical biological
assays. A full understanding of the role of
coregulators may be further complicated by gene
promoter-, tissue-, and species-specific effects, all of
which contribute to the cellular context. Thus, the
pattern of other proteins expressed in a cell (cellular
context) may greatly influence how and whether a
specific coregulator is the dominant effector in regulating
a ligand’s ability to affect ER-mediated transcription
(Clarke and Brunner, 1996; Clarke et al., 2001b).

The ability of an ER-driven transcription complex to
recruit coregulators can be strongly ligand-dependent.
For example, 4-hydroxytamoxifen induces a conformation
that blocks the coactivator recognition groove in
ER (Shiau et al., 1999). Estrogens and antiestrogens
have long been known to affect the physical properties
of ERs (Miller et al., 1984). The importance of ligand to
receptor conformation and activation led to early
conceptual models that have received renewed attention
in recent years. Perhaps the most important information
has come from crystallographic studies of the ER
binding domain complexed with different ligands
(Brzozowski et al., 1997; Pike et al., 1999; Shiau et al.,
2002). Several laboratories have used these data to
describe conceptually similar models of ER function
when liganded with either agonists or antagonists
(Wurtz et al., 1998; Pike et al., 1999; Liu et al., 2002a;
Shiau et al., 2002). The major limitations of such studies
are the use of only the ligand binding domain (which requires
the assumption that no other domains of the ER affect
its structure) and the use of crystal structures that may
or may not fully reflect receptor structure in the more
complex environment of a living cell. Nonetheless, data
from such studies can provide molecular insights into
important biological responses.

The consequences of ligand-specific ER conformations
are becoming evident but may be complex
(McKenna et al., 1999). The coactivator SRC-1
produces a ligand-independent activation of ER while
enhancing the agonist activity of the potent TAM
metabolite 4-hydroxytamoxifen (Smith et al., 1997).
SRC-1 also interacts synergistically with CRE binding
proteins in regulating ER-mediated transcription (Smith
et al., 1996). SMRT (corepressor) binds ER and blocks
the agonist activity of 4-hydroxytamoxifen induced by
SRC-1 (Smith et al., 1997). N-CoR is a corepressor that
binds TAM-occupied but not ICI 182,780-occupied ER
(Jackson et al., 1997). The functional relevance of this
latter observation is consistent with the lack of full
crossresistance between these two drugs in cell culture
models (Brunner et al., 1993b) and in breast cancer
patients (Howell et al., 1995; Robertson, 2001). However,
a recent study found no association between
N-CoR expression and outcome in TAM-treated
patients (Osborne et al., 2002).

It might be expected that increased expression or
function of a protein that allows an antiestrogen to act
as an agonist, or decreased expression of a coregulator
that suppresses ER activity when the receptor is
occupied by an antiestrogen, could confer a degree of
antiestrogen resistance (Clarke and Brunner, 1996;
Clarke et al., 2001b). Evidence for this in human
cancers and experimental models remains somewhat
limited. Expression of the corepressor N-CoR is lower in
TAM-stimulated MCF-7 xenografts than in wild-type
xenografts (Lavinsky et al., 1998), but the functional
relevance of the observation in human cancers is
unclear. Chan et al. (1999) studied a small cohort of
TAM-resistant human breast tumors (n = 19) but found
no difference in the expression of TIF-1, RIP140, or the
corepressor SMRT. Lower levels of the coactivator
SUG-1 were detected in some TAM-resistant tumors,
but the consequences for antiestrogen responsiveness of
reduced SUG-1 expression require further study.

Extrapolating many of these observations to specific
biological functions in breast tumors is not always a
simple matter. For example, most data have been
obtained, of necessity, from the use of somewhat
artificial experimental models with simple promoter
conformations. ERE structure is variable across known
estrogen-regulated genes, and a promoter’s ability to
bind ERs and coregulators can be affected by its local
structure (Truss and Beato, 1993; Nardulli et al., 1995;
Lee and Lee, 2001). Different ER-antiestrogen complexes
also may recognize different promoter elements
(Yang et al., 1996). Thus, promoter context is likely to
be important (Clarke and Brunner, 1996). Given the
evidence of considerable coregulator redundancy and
ubiquitous expression (McKenna et al., 1999;
Planas-Silva et al., 2001; McKenna and O’Malley, 2002), it is
unclear whether measuring or affecting changes in the
expression/function of any single coregulator will prove
clinically useful. Attempting to affect resistance by
modifying the expression of any single coregulator
could be confounded by compensatory responses in
other coregulators, as likely happens for mammary
gland development in SRC-1 (Xu et al., 1998) and
E6-AP null mice (Smith et al., 2002). A greater degree of
specificity will likely be obtained by targeting specific
genes within a functionally relevant gene network
(Clarke and Brunner, 1996), which would be downstream
of any coregulator activities. The overall balance
in the patterns and levels of expression of coactivators
and coregulators also likely contributes to ER signaling
and endocrine responsiveness. Clearly, cellular context is
critical in assessing the role of specific coregulators in
affecting a given phenotype (Clarke and Brunner, 1996;
Clarke et al., 2001b).

In summary, with such redundancy and apparent lack
of cell/tissue specificity, measuring the expression of
specific coregulators to predict an antiestrogen-resistant
phenotype may be uninformative, and affecting changes
in the expression/function of any single coregulator to
alter phenotype may prove difficult. We still do not
know with any certainty which estrogen-regulated genes
are responsible for affecting cell proliferation, cell
survival, or apoptosis in breast cancer. Hence, we do
not know the structure of their promoters, the
coregulators their occupied receptors can recruit into
functional or inactive transcription complexes, or the
cellular context in which they exist in responsive and
resistant cells.


Estrogen receptor-independent cell signaling in
antiestrogen resistance

Only a small proportion of ER−/PR− tumors respond
to antiestrogens, consistent with their primary actions
being mediated by ER. Nonetheless, many investigators
have explored ER-independent signaling as mechanisms
of antiestrogen resistance. The primary role of these
effects is unclear and some occur at concentrations that
are not pharmacologically relevant. Nonetheless, such
activities can alter ER function or may interact with
signaling downstream of ER (Figure 2). Since these
mechanisms have been reviewed in detail (Clarke et al.,
2001b), we discuss only briefly some of the more
relevant ones.

Antiestrogen induction of oxidative stress
responses is perhaps the most widely studied
ER-independent mechanism. The redox metabolism of
several TAM metabolites can give rise to reactive
species that can induce oxidative stress (Ye and Bodell,
1996), and both TAM and 4-hydroxytamoxifen produce
8-hydroxy-2'-deoxyguanosine (Okubo et al., 1998).
TAM’s ability to induce quinone reductase (Montano
and Katzenellenbogen, 1997), protein kinase C
redistribution (Gundimeda et al., 1996), and lipid
peroxidation (Schiff et al., 2000), and our observations that
antiestrogen-resistant cells upregulate cytochrome
c oxidases (Gu et al., 1997) and NFκB (Gu et al.,
2002) also are consistent with antiestrogen effects on
oxidative stress responses (reviewed by Clarke et al.,
2001b).

Other ER-independent effects include perturbations
in membrane structure (Clarke et al., 1990b), changes in
protein kinase C activation and subcellular localization
(O’Brian et al., 1986; Gundimeda et al., 1996), and
inhibition of the intracellular Ca2+ binding protein
calmodulin (Rowlands et al., 1995). Some of these
effects may be inter-related, since inhibition of protein
kinase C also blocks calmodulin-dependent EGFR
transactivation (Tebar et al., 2002). These latter
mechanisms may arise independent of ER, but would
affect ER-mediated signaling. Calmodulin has been
implicated as a coregulator of ER action (Biswas et al.,
1998), and EGFR-mediated signaling through MAPK
may affect ER activation (for recent reviews see Clarke
et al., 2001b; Santen et al., 2002).

Figure 2 Putative role of estrogen receptor-independent effects of
steroids and antiestrogens. These activities are induced by
hormones or antihormones but are not directly mediated by their
interactions with ERs. Such effects may be necessary, but they are
not generally sufficient, to elicit a proliferative/antiproliferative
response at most physiologically or pharmacologically relevant
concentrations. ER-independent events may affect ER signaling
either by altering ER activation and/or regulating the expression/
function of other genes/proteins that are induced/repressed
downstream of directly ER-regulated transcriptional events. The hatched
ellipse represents a coregulator; circled P = phosphorylation.

The extent to which these mechanisms are truly
ER-independent, in that they do not affect any aspect of
ER-mediated signaling, requires further study. As with
TAM’s effects on calmodulin, ER-independent interactions
may have significant effects on ER activation and
function. For example, several growth factors appear to
be able to activate ER through the induction of MAPK
activities capable of changing ER’s phosphorylation
status (Clarke et al., 2001b; Santen et al., 2002). Other
ER-independent events may interact with ER-mediated
signaling downstream of ER activation. Despite these
many activities, ER expression is required for most cells
to respond to antiestrogens. While the importance of
ER-independent signaling is unclear, many such signals
may be necessary but not sufficient for affecting
antiestrogen responsiveness (Clarke et al., 2001b).


Antiestrogens,  apoptosis,  and  cell  death 

Antiestrogenic exposures produce a G0/G1 cell cycle
arrest (Taylor et al., 1983), whereas estrogenic exposures
are primarily mitogenic and increase the proportion of
cells in S and G2/M while reducing the proportion in
G0/G1. Such effects are generally consistent with a
cytostatic rather than cytotoxic effect. However, in our
experience, long-term selection against antiestrogens in
vitro or prolonged estrogen withdrawal from
estrogen-dependent cells also induces cell death. Similar effects
are seen in animal models. These observations are
consistent with the ability of antiestrogens to reduce the
incidence of ER+ breast cancers in high-risk women
(chemoprevention) and produce an overall survival
benefit in breast cancer patients (treatment). Initially,
antiestrogens may produce a cytostatic effect that, in the
longer term, results in cell death.

The precise mechanisms signaling to and responsible
for antiestrogen-induced cell death are not fully understood.
Most studies are consistent with an induction of
apoptotic or programmed cell death (Kyprianou
et al., 1991; Huovinen et al., 1993; Zhang et al., 1999).
However, many breast cancers that acquire antiestrogen
resistance still respond well to cytotoxic drugs, many of
which also signal to apoptosis (Wang et al., 1996a). Such
effects could not occur if the machinery for inducing
apoptosis were no longer intact or functional. Thus, the
effects of antiestrogens must be upstream of effector
mechanisms and reflect subtle changes in how ERs
affect signaling to apoptosis. Other signaling pathways
also may be important. Data from a recent study suggest
that adjacent normal mammary cells can induce cell
death through Fas signaling in breast cancer cells.
Resistance to this effect in some breast cancer cells was
reversed by inhibition of NFκB and PI3 kinase (Toillon
et al., 2002).


Tamoxifen-stimulated  phenotype  in  antiestrogen 
resistance 

While antiestrogens can induce growth arrest and
apoptosis, in some patients, initiation of TAM therapy
is associated with rapid progression of their disease,
although continuation of TAM generally produces a
beneficial response (Plotkin et al., 1978; Clarysse, 1985).
This response is called ‘tumor flare’ and is generally
attributed to the estrogenic properties often seen with
low doses of TAM. TAM takes approximately 4 weeks
to reach effective steady-state levels, producing a
window in which patients are exposed to suboptimal
and potentially estrogenic concentrations of TAM
(Buckley and Goa, 1989; Etienne et al., 1989). These
tumors are clearly not resistant to TAM, in either the
pharmacologic or clinical context. Tumor flare should
not be confused with the clinical TAM-stimulated
resistance phenotype that may occur after prolonged
TAM exposure and an initial TAM response.

Unlike tumor flare in previously untreated patients,
evidence from MCF-7 human breast cancer xenografts
suggests that some breast cancers may be initially
growth inhibited by TAM, only to later become
dependent on TAM for proliferation (Osborne et al.,
1987; Gottardis et al., 1989; Connor et al., 2001). These
xenografts also retain the ability to be stimulated by
estrogens (remain estrogen-dependent). Pharmacologically,
this phenotype is not a resistance phenotype
because the cells are clearly responding to the drug.
However, a TAM-stimulated phenotype would represent
clinical drug resistance because the nature of the
response has changed in a manner that supports disease
progression and would require a change in treatment.
Acquired TAM dependence appears to reflect a switch
in how the cells perceive TAM (as an ER agonist rather
than antagonist). Several possible mechanisms may
explain how this switch occurs in MCF-7 cells, including
immunologic effects, ER mutations, and changes in
growth factor or coregulator expression.

AIB1 and tamoxifen-stimulated growth as an antiestrogen
resistance mechanism

AIB1 (amplified in breast cancer-1; also known as
SRC-3, RAC3, TRAM-1, pCIP, ACTR) is a steroid
hormone receptor coactivator located on chromosome
20q12 (Anzick et al., 1997) that has recently received
attention as a possible contributor to antiestrogen
responsiveness. AIB1 binds ER (Azorsa et al., 2001),
enhances the expression of cyclin D1 (Planas-Silva et al.,
2001), and exhibits somatic instability in some breast
cancers (Dai et al., 2002). AIB1’s function as an ER
coactivator produces increased transcriptional activation
of ER (Anzick et al., 1997). A novel AIB1 isoform
(AIB1-Δ3) has been recently reported that increases
hormone and growth factor sensitivity (Reiter et al.,
2001) and increases the estrogenicity of 4-hydroxytamoxifen
to a greater degree than wild-type AIB1 (Dr
Anna Riegel, Georgetown University Medical School,
personal communication). The mRNA for AIB1-Δ3 was
detected at levels higher than normal cells in 7/8 breast
cancers (Reiter et al., 2001).

The data in Table 4 show some of the characteristics
of AIB1 amplification and expression in breast cancers.
Most studies have explored either gene amplification
(found in <10%) or mRNA expression (reported in
10-64% of breast tumors). One study reported AIB1
protein expression as being above that seen in normal
breast cells in approximately 10% of breast cancers by
immunohistochemistry. Protein expression was detected
at levels similar to or greater than those seen in normal
breast cells in about 60% of ER+ tumors.

The association of AIB1 with ER status is difficult to
determine from the small number of studies available.
While AIB1 amplification has been associated with
ER-positivity (Anzick et al., 1997), increased AIB1 mRNA
expression has been associated with ER-negativity
(Bouras et al., 2001). Similar proportions of detectable
and undetectable AIB1 protein levels (~65%) were
found in ER+ tumors (12/21 had undetectable expression;
11/16 had detectable expression); no significant
correlation between AIB1 and either ER or PR was
found (List et al., 2001).

Table 4 AIB1 amplification and expression in breast cancer (representative studies)

DNA amplification     mRNA overexpression             Protein                         Study
10/105 (9.5%)         48/75 (64% relative to normal)  Not reported                    Anzick et al. (1997)
56/1157 (4%)          Not reported                    Not reported                    Bautista et al. (1998)
  ER−: 10/429 (2.3%)
  ER+: 45/769 (5.9%)
No data               26/83 (31%)                     Not reported                    Bouras et al. (2001)
                        High AIB1: ER+ 11/26 (42%)
                        Low AIB1: ER+ 44/55 (80%)
Not detected (0%)     3/23 (13%)                      Not reported                    Glaeser et al. (2001)
20/259 (7.7%)         Not reported                    Not reported                    Cuny et al. (2000)
Not reported          Not reported                    4/41 (9.8% relative to normal)  List et al. (2001)
                                                        Present: ER+ 11/16 (69%)
                                                        Absent: ER+ 12/21 (57%)
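
The percentages in Table 4 can be recomputed from the counts printed beside them. The Python sketch below is illustrative only: the row labels, the `recompute` helper, and the 0.5 percentage-point tolerance are assumptions introduced here, whereas the counts and printed percentages are those given in the table.

```python
# Each row: (label, numerator, denominator, percentage as printed in Table 4).
TABLE_4 = [
    ("Anzick 1997, DNA amplification", 10, 105, 9.5),
    ("Anzick 1997, mRNA overexpression", 48, 75, 64.0),
    ("Bautista 1998, DNA amplification (all tumors)", 56, 1157, 4.0),
    ("Bautista 1998, DNA amplification (ER-)", 10, 429, 2.3),
    ("Bautista 1998, DNA amplification (ER+)", 45, 769, 5.9),
    ("Bouras 2001, mRNA overexpression", 26, 83, 31.0),
    ("Bouras 2001, high AIB1 that are ER+", 11, 26, 42.0),
    ("Bouras 2001, low AIB1 that are ER+", 44, 55, 80.0),
    ("Glaeser 2001, mRNA overexpression", 3, 23, 13.0),
    ("Cuny 2000, DNA amplification", 20, 259, 7.7),
    ("List 2001, protein above normal", 4, 41, 9.8),
    ("List 2001, AIB1 present that are ER+", 11, 16, 69.0),
    ("List 2001, AIB1 absent that are ER+", 12, 21, 57.0),
]


def recompute(rows, tolerance=0.5):
    """Return (label, recomputed %) for rows whose printed percentage differs
    from the recomputed one by more than `tolerance` percentage points.
    The tolerance is an arbitrary, hypothetical threshold."""
    flagged = []
    for label, numerator, denominator, printed in rows:
        computed = 100 * numerator / denominator
        if abs(computed - printed) > tolerance:
            flagged.append((label, round(computed, 1)))
    return flagged


flagged = recompute(TABLE_4)
for label, computed in flagged:
    print(f"{label}: recomputed {computed}%")
```

Under these assumptions, only the overall amplification frequency attributed to Bautista et al. (1998) falls outside the tolerance; whether that reflects truncation rather than rounding, or a loss during extraction, cannot be determined from the text here.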

Approximately 10% of all ER+ breast tumors may
overexpress wild-type AIB1 protein (List et al., 2001). It
remains to be seen if this 10% is composed primarily of
TAM-stimulated tumors, and/or those tumors that
exhibit AIB1 gene amplification. One recent study
compared AIB1 (western) and erbB2 expression. The
5-year disease-free survival was lower in those tumors
expressing high levels of both AIB1 and erbB2 when
compared with those expressing high levels of AIB1 and
low levels of erbB2. AIB1 and number of positive lymph
nodes were also correlated with shorter disease-free
survival in TAM-treated compared with untreated
patients (Osborne et al., 2003).

Overexpression of AIB1 and AIB1-Δ3 can confer a
TAM-stimulated phenotype that should also be estrogen
responsive (Dr Anna Riegel, Georgetown University
Medical School, personal communication). The proportion
of AIB1-overexpressing cells that are dependent
upon this activity for survival/proliferation is unknown.
The proportion of breast biopsies that respond
mitogenically to both TAM and estradiol in short-term
culture (4%; see below) suggests that up to one-half of
AIB1-overexpressing tumors might be TAM-stimulated.
Since these tumors are predicted to retain estrogen
responsiveness, and may still synthesize estrogens, many
likely retain responsiveness to aromatase inhibitors.
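
The "up to one-half" estimate can be reproduced as a rough upper bound. The sketch below assumes that the 4% figure corresponds to the 6 of 153 ER+ cultures stimulated by both TAM and estradiol (Nomura et al., 1990, discussed later in this section) and that every such tumor overexpresses AIB1; both are illustrative assumptions, and the variable names are hypothetical.

```python
# Counts and proportions quoted in the text (Nomura et al., 1990; List et al., 2001).
er_positive_cultures = 153
stimulated_by_tam_and_estradiol = 6
aib1_overexpressing_fraction = 0.10  # "approximately 10%" of ER+ tumors

# Fraction of ER+ cultures that respond mitogenically to both ligands.
dual_fraction = stimulated_by_tam_and_estradiol / er_positive_cultures

# Upper bound: assumes every dually stimulated tumor is an AIB1 overexpressor
# (an assumption made for illustration, not a finding reported in the source).
upper_bound = dual_fraction / aib1_overexpressing_fraction

print(f"stimulated by both TAM and estradiol: {dual_fraction:.1%}")
print(f"upper bound among AIB1 overexpressors: {upper_bound:.0%}")
```

The bound comes out somewhat below one-half, in line with the hedged wording of the text; if only some dually stimulated tumors overexpress AIB1, the true fraction would be lower.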

The AIB1-overexpressing phenotype is broadly similar
to some MCF-7 TAM-stimulated xenograft models.
Since wild-type MCF-7 cells already overexpress AIB1
(Azorsa et al., 2001) and the AIB1-Δ3 (Reiter et al.,
2001), it is not surprising that selection against TAM
might produce a TAM-stimulated phenotype. Indeed,
this phenotype is already present in some MCF-7 cells
without TAM selection (Dumont et al., 1996). It
remains to be seen whether this model is primarily
driven by an overexpression of wild-type AIB1. Since
the AIB1-Δ3 was identified in MCF-7 cells and is more
potent, this isoform may also contribute to the
phenotype of these xenografts and some human breast
cancers. Indeed, this variant may prove to be more
relevant in a broader context because of its ability to
also affect growth factor signaling, an effect that could
be important in both ER+ and ER− cells (Reiter et al.,
2001).

Clinical  relevance  of  the  tamoxifen-stimulated  phenotype 
as  an  antiestrogen  resistance  mechanism 

Direct evidence of a TAM-stimulated resistance phenotype
in breast cancer patients is difficult to find. Indirect
evidence may be found from studies that assessed the
frequency of a TAM withdrawal response. These
responses are evident when a tumor progressing on
TAM regresses upon cessation of the TAM therapy.
Recently, we completed an extensive review of the
literature and found 241 cases in five studies where the
authors looked specifically for evidence of TAM withdrawal
responses (Clarke et al., 2001b). Responses were
assessed by relatively similar criteria and could be
combined into three groups: complete response, partial
response, and worse than partial response. Evidence was
found for only 3/241 complete responses (1.2%) and
13/241 partial responses (5.4%). Over 90% of cases
experienced a worse than partial response to TAM
withdrawal (225/241; 93.4%).
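
The proportions quoted above follow directly from the pooled counts. A minimal Python sketch of that arithmetic is given below; the dictionary and function names are illustrative, and only the counts come from the text.

```python
# Pooled TAM withdrawal outcomes, as quoted from Clarke et al. (2001b).
withdrawal_counts = {
    "complete response": 3,
    "partial response": 13,
    "worse than partial response": 225,
}


def percentages(counts):
    """Return each category as a percentage of the pooled total (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_cases = sum(withdrawal_counts.values())
withdrawal_pct = percentages(withdrawal_counts)

# Complete plus partial responses, the figure reused later in the section.
objective = (withdrawal_counts["complete response"]
             + withdrawal_counts["partial response"])
objective_pct = round(100 * objective / total_cases, 1)

for name, pct in withdrawal_pct.items():
    print(f"{name}: {withdrawal_counts[name]}/{total_cases} ({pct}%)")
print(f"complete or partial: {objective}/{total_cases} ({objective_pct}%)")
```

The three categories account for all 241 pooled cases, so the percentages sum to 100% up to rounding.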

Since breast tumors are highly heterogeneous, the
TAM-stimulated population may not be the dominant
cell population in most tumors. Thus, elimination of the
TAM-dependent/stimulated population may not be
sufficient to induce a complete or partial clinical
response because the bulk of the tumor is independent
of any TAM-induced proliferation. In our evaluation of
the literature, disease stabilization was the most
common beneficial response to TAM withdrawal.
Disease stabilization might indicate tumors that contain
populations that are no longer growth-stimulated by
TAM and/or a shift in the balance between cell loss/
death and proliferation. Whatever the mechanisms, cells
in these tumors are clearly not primarily dependent
upon TAM for survival, since the great majority of
patients (194/241; 80%) experienced disease progression
upon TAM withdrawal even when disease stabilization
is included as a beneficial response (Clarke et al., 2001b).

These  data  imply  that  the  majority  of  tumors  in 
patients  that  progress  on  TAM  treatment  are  not 
progressing  because  they  have  acquired  a  TAM- 
stimulated  phenotype.  Indeed,  the  responses  reported 
for  TAM  withdrawal  may  be  a  mix  of  several  possible 
mechanisms,  including  immunologic  effects  or  other 
mechanisms  not  directly  mediated  through  ER.  Such 
indirect  mechanisms  can  be  largely  eliminated  in  in  vitro 
models.  A  study  of  224  human  breast  cancer  biopsies 
(153 ER+ and 71 ER-) used an in vitro approach to
measure  more  directly  the  frequency  of  an  ER-mediated, 
TAM-  and/or  estradiol-stimulated  phenotype  (Nomura 
et  al,  1990).  Primary  cultures  of  breast  cancer  biopsies 
were  studied  for  the  ability  of  TAM  and  estradiol  to 
induce  a  mitogenic  response  in  vitro.  Only  11/153  (7%) 
of  ER+  cultures  exhibited  a  mitogenic  response  to 
TAM, a proportion surprisingly similar to the proportion
(16/241; 6.6%) of patients estimated to experience
either a complete or partial response to TAM withdrawal
(Clarke et al, 2001b).

Of interest is the observation that only 6/11 of the
TAM-stimulated tumors were also stimulated by estrogen
(Nomura et al, 1990). Thus, the TAM- and
estradiol-stimulated phenotype, as expressed by some
MCF-7 human breast cancer xenografts, reflected only
4% (6/153) of the phenotypes of the ER+ patient
biopsies and only about 50% (6/11) of the TAM-stimulated
phenotypes.
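
To make the comparison between the in vitro and clinical estimates explicit, the sketch below recomputes the fractions from the counts quoted in the text (153 ER+ cultures, 11 TAM-stimulated, 6 also estradiol-stimulated, and 16/241 withdrawal responders). The names are illustrative assumptions; only the counts come from the cited studies.

```python
from fractions import Fraction

# Counts quoted in the text (Nomura et al, 1990; Clarke et al, 2001b).
er_positive_cultures = 153
tam_stimulated = 11
tam_and_estradiol_stimulated = 6
withdrawal_responders = Fraction(16, 241)  # complete + partial responses


def as_percent(fraction):
    """Convert an exact fraction to a percentage rounded to one decimal."""
    return round(float(fraction) * 100, 1)


in_vitro_tam = Fraction(tam_stimulated, er_positive_cultures)
in_vitro_both = Fraction(tam_and_estradiol_stimulated, er_positive_cultures)
share_of_tam_stimulated = Fraction(tam_and_estradiol_stimulated, tam_stimulated)

print("TAM-stimulated ER+ cultures:", as_percent(in_vitro_tam), "%")
print("TAM- and estradiol-stimulated:", as_percent(in_vitro_both), "%")
print("Share of TAM-stimulated cultures:", as_percent(share_of_tam_stimulated), "%")
print("Clinical withdrawal responders:", as_percent(withdrawal_responders), "%")
```

Exact fractions are used so that rounding happens only once, at the point of display.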

Together,  these  data  imply  that  the  TAM-stimulated 
phenotype  is  only  one  of  several  that  produce  clinical 
resistance. If up to 20% of initially hormone responsive
cases become TAM-stimulated to some degree (estimate
includes disease stabilization responses), by whatever
combination of cellular, molecular, and/or immunologic
mechanisms this stimulation is conferred, a significant
number of women could be affected. Unfortunately,
that  still  leaves  the  remaining  80%  at  risk  of  acquiring 
resistance  through  other  mechanisms.  From  existing 
evidence,  the  TAM-  and  estradiol-stimulated  phenotype 
exhibited  by  some  MCF-7  xenografts  may  be  a  minor 
component  of  all  TAM  resistance  phenotypes.  Clearly, 
other antiestrogen resistance mechanisms exist, including
antiestrogen unresponsiveness, and these remain to
be  identified  and  characterized. 


Gene  networks  in  estrogen  receptor-mediated  cell 
signaling  in  antiestrogen  resistance 

ERα expression is both necessary and sufficient to
predict responsiveness to antiestrogens in a high
proportion of breast tumors. Thus, antiestrogen-induced
effects on ERα-mediated signaling are almost
certainly  of  critical  importance  in  effecting  clinical 
responses  in  many  tumors.  Nonetheless,  we  still  do  not 
know  the  genes  responsible  for  signaling  to  these  effects, 
or  whether  the  effects  are  primarily  to  induce  cell  death, 
repress  cell  survival,  or  a  combination  of  both.  As  noted 
above,  ER-independent  events  may  also  interact  with 
ER-mediated  signaling  and  this  may  be  important  in  the 
broader  context  of  a  gene  network  that  regulates 
antiestrogen responsiveness. Thus, estrogens and antiestrogens
may differentially affect a gene network that
contains  some  ER-regulated  genes  (Clarke  and  Brunner, 
1995,  1996).  More  recently,  this  concept  has  been 
extended  to  incorporate  the  likely  ability  of  integrated 
signals  to  induce  apoptosis  while  concurrently  blocking 
differentiation  and  proliferation  (Clarke  et  al,  2001c).  It 
is  predicted  that  such  a  network  would  be  affected  by 
TAM  in  TAM-stimulated  models  by  signaling  through 
patterns similar to estradiol. In antiestrogen unresponsive
cells, signaling through this network may use
different  signaling  patterns  and/or  exhibit  differential 
regulation/expression  of  some  of  the  same  genes  affected 
by  estradiol. 

The  concept  of  a  network  differs  from  that  of  a  signal 
transduction  pathway  in  that  it  requires  the  integration 
of several pathways, de-emphasizes the role of well-established
single signal transduction pathways, and
acknowledges that few complex phenotypes are likely
to be driven by a single gene/pathway
(Clarke  et  al,  2001c).  Owing  to  the  plasticity  of  breast 
cancer  phenotypes,  as  illustrated  by  the  diversity  of 
endocrine  resistance  phenotypes  (Clarke  and  Brunner, 
1995),  the  gene  network  concept  seems  reasonable. 
Considering  signaling  within  the  constraints  of  a  single, 
linear  pathway  may  be  inappropriate.  At  best,  such  an 
approach  is  likely  to  produce  an  incomplete  solution;  at 
worst,  it  may  be  misleading. 

Delineating  the  components  of  a  signaling  network 
for  estrogens/antiestrogens  may  not  be  simple  (Clarke 
and  Brunner,  1996).  ERs  regulate  gene  expression 
through  direct  binding  to  EREs  and  direct  interactions 
with other transcription factors including AP-1 and SP-1.
The nature of ER activation is affected by ligand
structure, and different ligands likely differentially affect
the expression and function of the same members of any
gene network. For example, raloxifene may regulate
gene expression through novel pathways not affected by
TAM or ICI 182,780 (Yang et al, 1996), and as noted
above, antiestrogens differentially affect transcription
when bound to ERα compared with ERβ. Regulation of
the  entire  network  or  key  components  of  the  network 
may  also  be  affected  by  ER-independent  signaling,  for 
example,  as  intracellular  signals  are  perturbed  by 
tumor-stromal  cell  interactions.  Temporal  and  spatial 
organization  of  signaling  components  in  a  network  is 
also  critical.  The  likely  complexity  of  network  regulation 
has been described elsewhere (Clarke et al, 2001c).

Accepting  the  principle  of  a  network  is  technically 
demanding  because  it  requires  experimental  methods  to 
evaluate  concurrently  the  expression  of  multiple  genes 
and informatic methods capable of integrating expression
pattern analyses with functional information.
Methods to obtain such high-dimensional data are well
established and can be used to explore both the
transcriptome and proteome of cells and tumors.
However, data analysis methods for exploring gene
expression microarray or two-dimensional gel electrophoresis
data remain in their infancy and it may be
several  years  before  adequate  methods  become  available 
and  widely  accepted. 


We have begun to apply both proteome (Skaar et al,
1998) and transcriptome analyses (Ellis et al, 2002; Gu
et al, 2002) to breast cancer cell lines, xenografts, and
tumors to identify potentially important components of
a large signaling network that may contribute to both
estrogen independence and acquired antiestrogen resistance.
Current informatic methods do not provide an
easy  way  to  uncover  rapidly  and  correctly  an  entire 
signaling  network.  However,  it  should  be  possible  to 
discover  integral  components  of  an  overall  network  and 
eventually  piece  together  these  components  to  reveal  the 
entire  network’s  structure. 

We  first  identified  appropriate  cellular  models, 
derived  adequate  algorithms  for  data  analysis,  and 
began  to  explore  the  proteomes  by  two-dimensional  gel 
electrophoresis  and  the  transcriptomes  by  serial  analysis 
of  gene  expression  and  gene  expression  microarrays. 
Remarkably  few  antiestrogen  resistance  models  are 
available  for  study,  and  almost  all  are  based  on  the 
MCF-7  human  breast  cancer  cell  line  (reviewed  in 
Clarke et al, 2001b). MCF-7 xenografts selected against
TAM  almost  exclusively  produce  a  TAM-stimulated 
phenotype,  which  may  not  be  representative  of  the 
majority  of  human  breast  cancers  (see  below).  Thus,  we 
established  several  E2-independent  but  responsive 
breast  cancer  cell  variants  with  differing  antiestrogen 
response  profiles. 

MCF-7  cells  were  first  selected  for  an  ability  to  grow 
in  vivo  in  ovariectomized  athymic  nude  mice.  The 
resulting variant (MCF7/MIII) is estrogen-independent
for growth both in cell culture and as xenografts (Clarke
et al, 1989a), but retains responsiveness to antiestrogens;
that is, it is estrogen-independent but has an
antiestrogen responsive phenotype (Clarke et al.,
1989a, b). We further selected these cells in vivo and
found that repeated in vivo estrogen withdrawal, which
generated the MCF7/LCC1 variant, did not substantially
change the antiestrogen responsiveness of the cells
(Brunner et al, 1993a). MCF7/LCC1 cells were then
selected in vitro for resistance to 4-hydroxytamoxifen.
The resulting MCF7/LCC2 cells are TAM-resistant but
ICI 182,780 responsive (Brunner et al, 1993b). This
phenotype predicted the subsequent observation that
patients  responding  to  TAM,  and  then  acquiring  a 
TAM-resistant  phenotype,  have  a  high  probability  of 
retaining  sensitivity  to  ICI  182,780  (Howell  et  al.,  1995). 
In  marked  contrast,  MCF7/LCC1  cells  selected  for 
resistance  to  ICI  182,780  (MCF7/LCC9  variant)  acquire 
resistance  to  ICI  182,780  and  crossresistance  to  TAM 
(Brunner et al, 1997). These models represent pharmacologic
models of antiestrogen resistance in the context
that  they  no  longer  respond  to  the  growth  inhibitory 
effects  of  antiestrogens.  Models  that  reflect  a  switch  to 
an  antiestrogen-stimulated  phenotype  are  described 
above. 

By  comparing  the  proteomes  and  transcriptomes  of 
several  of  these  MCF7/LCC  variants,  we  have  begun  to 
identify  what  we  believe  is  one  component  of  a  larger 
gene network that may regulate antiestrogen responsiveness.
The relevance of this gene subset is already
under  intensive  investigation  in  functional  studies 
in  vitro  and  in  vivo  and  for  its  ability  to  improve 
prediction  of  antiestrogen  responsiveness  in  breast 
cancer  patients. 

Candidate  genes 

The first goal in these studies was to identify differentially
expressed genes and proteins that might contribute
to acquired estrogen independence and/or antiestrogen
resistance. The data in Table 5 are adapted from our
most recent study (Gu et al, 2002) and show the
differential regulation of genes we use below to
construct one component of a putative antiestrogen
responsiveness signaling network. Functional studies of
the interactions described in this network are currently
in progress.

Table 5  Genes in a putative signaling network

Gene(a)                    Analysis      MCF7/LCC1 vs MCF7/LCC9(b)

EGFR                       Microarray    Twofold
EGR-1                      Microarray    Threefold
IRF-1                      Microarray    Twofold
NFκB                       Microarray    0.5-fold
n-ras-related gene         SAGE          0.5-fold
Superoxide dismutase       Microarray    0.5-fold
TNFα                       Microarray    Twofold
TNF-R1                     Microarray    Twofold
X-box binding protein-1    SAGE          0.25-fold

(a) Links to the UniGene clusters for these and other genes from this
study can be found at http://clarkelabs.georgetown.edu/Gu_et_al/Tables.htm.
(b) Since the fold differences are relative to MCF7/LCC1
levels, genes upregulated in MCF7/LCC9 cells are expressed as a
fraction.
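
Because Table 5 reports expression in MCF7/LCC1 relative to MCF7/LCC9, values below one correspond to genes that are upregulated in the resistant cells. The following sketch, which assumes only the fold values printed in Table 5 (the dictionary and function names are illustrative), inverts each ratio to give the change in MCF7/LCC9 relative to MCF7/LCC1:

```python
from fractions import Fraction

# Table 5 values: expression in MCF7/LCC1 relative to MCF7/LCC9.
lcc1_vs_lcc9 = {
    "EGFR": Fraction(2),
    "EGR-1": Fraction(3),
    "IRF-1": Fraction(2),
    "NFkB": Fraction(1, 2),
    "n-ras-related gene": Fraction(1, 2),
    "Superoxide dismutase": Fraction(1, 2),
    "TNFa": Fraction(2),
    "TNF-R1": Fraction(2),
    "XBP-1": Fraction(1, 4),
}


def change_in_lcc9(ratio):
    """Invert an LCC1/LCC9 ratio to get the fold change in LCC9 and its direction."""
    fold = 1 / ratio
    if fold > 1:
        direction = "up"
    elif fold < 1:
        direction = "down"
    else:
        direction = "unchanged"
    return fold, direction


changes = {gene: change_in_lcc9(ratio) for gene, ratio in lcc1_vs_lcc9.items()}
up_in_lcc9 = sorted(g for g, (_, d) in changes.items() if d == "up")
down_in_lcc9 = sorted(g for g, (_, d) in changes.items() if d == "down")

print("Higher in MCF7/LCC9:", up_in_lcc9)
print("Lower in MCF7/LCC9:", down_in_lcc9)
```

Read this way, X-box binding protein-1 at 0.25-fold corresponds to a fourfold increase in MCF7/LCC9 cells.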

Comparing  the  MCF7/LCC1  and  MCF-7  proteomes 
identified  nucleophosmin  (NPM)  as  being  associated 
with  estrogen  independence  (Skaar  et  al,  1998).  NPM  is 
a  nucleolar,  DNA/RNA-binding  phosphoprotein 
(Wang  et  al,  1994;  Herrera  et  al,  1995)  that,  when 
overexpressed  in  NIH  3T3  cells,  produces  a  fully 
transformed phenotype (Kondo et al, 1997). Downregulating
NPM delays entry into mitosis (Jiang and
Yung, 1999), perhaps reflecting its differential phosphorylation
by key kinases: p34cdc2 kinase (Peter et al,
1990), CDK2/cyclin E (Tokuyama et al, 2001), and
protein kinase C (Beckmann et al, 1992). NPM binds
the retinoblastoma protein to induce DNA polymerase α
(Tchoudakova et al, 1999) and decreases susceptibility
to butyrate-induced apoptosis through inducing telomerase
activity (Liu et al, 1999). Overexpression of
NPM is seen in colorectal (Nozawa et al, 1996) and
prostate cancers (Bocker et al, 1995). NPM is
E2-regulated in breast cancer cells (Brankin et al,
1998) and anti-NPM autoantibodies are readily detected
in the sera of breast cancer patients (Brankin et al,
1998). NPM blocks the transcriptional activator functions
of both YY1 (Inouye and Seto, 1994), which
regulates β-casein production in the mammary gland
(Raught et al, 1994), and the putative tumor suppressor
gene interferon regulatory factor-1 (IRF-1). NPM
regulates the stability and activation of p53 (Colombo
et al, 2002), implicating its activities in p53-mediated
effects on apoptosis, and p53 is sequestered in the cytosol
of TAM-resistant MCF7/LCC2 cells (Lilling et al,
2002).

Exploring the MCF7/LCC1 and MCF7/LCC9 transcriptomes
by SAGE identified several differentially
expressed genes (Gu et al, 2002). We discuss here only
the human X-box binding protein-1 (XBP-1) and the
n-ras-related gene. XBP-1 is a member of the ATF/CREB
transcription factor family that activates promoters
containing CREs (Clauss et al, 1996). During liver
regeneration, XBP-1 is associated with increased proliferation
and reduced apoptosis (Reimold et al, 2000),
implying a survival function that may explain the role of
its overexpression in hepatocellular carcinomas (Kishimoto
et al, 1998). XBP-1 is expressed within a cluster of genes
associated with some ER+ breast tumors (Perou et al,
2000), and we have recently begun to explore its role in
normal and neoplastic breast cells.

The  role  of  the  n-ras-related  gene  is  unclear.  Ras 
expression  is  upregulated  in  many  breast  cancers  (Clark 
and  Der,  1995)  and  activates  signaling  through  MAPKs 
that  are  also  regulated  by  growth  factors  implicated  in 
estrogen/antiestrogen  responsiveness  and  mitogenesis 
(Dickson and Lippman, 1995; Clarke et al, 2001b;
Santen et al, 2002). These MAPKs have been implicated
in phosphorylating and activating ERs, an effect that
could influence antiestrogen responsiveness (Clarke
et al, 2001b; Santen et al, 2002). However, some recent
studies suggest that MAPK's effects on ER do not
influence antiestrogen responsiveness (Atanaskova et al.,
2002).

Exploring the MCF7/LCC1 and MCF7/LCC9 transcriptomes
by gene expression microarrays implicated
several genes including IRF-1, nuclear factor-κB
(NFκB), early growth response gene-1 (EGR-1), epidermal
growth factor receptor (EGFR), and both tumor
necrosis factor-alpha (TNFα) and its receptor TNF-R1
(Gu et al, 2002). While initially identified as an
interferon-induced gene, IRF-1 has now been implicated
in regulating several critical cellular functions and is a
putative tumor suppressor in some cancers (Tanaka
et al, 1994a, b; Yim et al., 1997). IRF-1's tumor
suppressor activities may be related to its ability to
signal to apoptosis (Tanaka et al, 1994a), which can
occur in a p53-dependent or -independent manner
(Tamura et al, 1995; Tanaka et al, 1996), with or
without induction of p21WAF1/CIP1 (Tanaka et al, 1996) or
p27Kip1 (Moro et al, 2000), and through caspase-1
(Tamura et al, 1995), -7 (Sanceau et al, 2000), -8
(Suk et al, 2001), and/or Fas-ligand (Chow et al, 2000).
Potentially related to these activities is the ability of
SAPK p38, which is involved in signaling to apoptosis in
response to stress, to activate IRF-1/interferon-stimulated
response element binding (Varley and Dickson,
1999). Consistent with putative tumor suppressor
activities, one small immunohistochemical study reports
reduced IRF-1 expression in neoplastic compared with
normal human breast tissues (Doherty et al, 2001).

The consequence of NFκB activation is cell context
specific (Voegel et al, 1996), but it is generally
considered antiapoptotic in most cancer cells. Several
aspects of normal mammary gland development appear
dependent upon NFκB activity (Clarkson and Watson,
1999), likely reflecting its regulation by both estrogens
and growth factors (Nakshatri et al, 1997; Biswas et al,
2000). Elevated NFκB activity arises early during
neoplastic transformation in the rat mammary gland
(Kim et al, 2000). NFκB is widely expressed in human and rat
mammary tumors (Sovak et al, 1997; Cogswell et al,
2000), and its upregulation is associated with estrogen
independence (Nakshatri et al, 1997; Clarkson and
Watson, 1999). NFκB is the only protein known to
induce BRCA2 expression (Welcsh and King, 2001).
Several excellent reviews on NFκB signaling are
available (Bours et al, 2000; Baldwin, 2001; Karin
et al, 2002).

EGR-1 is a transcription factor with proapoptotic
activity (Das et al, 2000) and is downregulated in
DMBA-induced mammary adenocarcinomas in rats and
in mouse and human breast cancer cells (Huang et al,
1997). c-myc is a major regulator of breast cancer
proliferation and survival (Liao and Dickson, 2000) and
is among the genes downregulated by EGR-1 (Hoffman
et al, 2002). EGR-1 also blocks NFκB function
(Chapman and Perkins, 2000) and can stimulate
apoptosis through cooperation with p21WAF1/CIP1 and
transactivation of p53 (Liu et al, 1998). Superoxide
dismutase (SOD) expression is increased in MCF7/LCC9
cells (Gu et al, 2002) and in TAM-stimulated
MCF-7 xenografts (Schiff et al, 2000); SOD overexpression
was previously implicated in resistance to
TNFα (Zyad et al, 1994). A TNFα-mediated pathway
for signaling to apoptosis occurs in MCF-7 cells (Burow
et al, 1998; Egeblad and Jaattela, 2000), and measuring
serum TNF concentrations may be a useful prognostic
marker in breast cancer patients (Sheen-Chen et al,
1997). Furthermore, IRF-1 expression is induced by TNFα
in some cells (Mori et al, 1999).


One  component  of  a  gene  network 

Using  the  data  from  our  proteome  and  transcriptome 
studies  and  from  other  published  studies,  we  have  begun 
to  construct  a  gene  expression  network  for  signaling  in 
antiestrogen  responsiveness  (Figure  3).  Studying  a 
variant  that  is  crossresistant  to  triphenylethylenes  and 
steroidal  antiestrogens  (MCF7/LCC9)  provided  the 
opportunity  to  identify  more  broadly  based  resistance 
signaling  than  might  be  obtained  from  a  study  of  TAM- 
only resistance (e.g., MCF7/LCC2 phenotype). The
apparent consistency of the interactions among the
relatively few genes incorporated into our network
component is surprising. EGF-R induces expression of
EGR-1 (Tsai et al, 2000), and expression of both genes
is lower in MCF7/LCC9 cells (Gu et al, 2002). Since
EGR-1 inhibits NFκB function (Chapman and Perkins,
2000), its low expression may contribute to the increased
NFκB activity in these cells (Gu et al, 2002). IRF-1
induces EGF-R mRNA (Rubinstein et al, 1998), and
IRF-1 levels are lower in MCF7/LCC9 cells (Gu et al,
2002). IRF-1 is induced by TNFα/TNF-R1 (Mori et al,
1999), both of which are also concurrently downregulated
in MCF7/LCC9 cells, perhaps explaining
their lower IRF-1 levels. IRF-1 can act as a tumor
suppressor and signal to apoptosis through both p53-dependent
and -independent pathways (Taniguchi,
1997). These observations may reflect IRF-1's ability
to affect caspase activity, since caspase activation and
induction of apoptosis are implicated in affecting
antiestrogen responsiveness (Mandlekar et al,
2000a, b). Overexpression of caspase-1, which regulates
apoptosis in normal mammary epithelial cells
(Boudreau et al, 1995), is known to be lethal in MCF-7
cells (Keane et al, 1996). In these models, signaling
through caspase-3 is unlikely because the gene is
truncated in MCF-7 cells (Friedrich et al, 2001);
signaling through caspase-7 may dominate.

Figure 3  Part of a putative gene expression network constructed
from the genes differentially expressed in MCF7/LCC9 cells (TAM
and ICI 182,780 crossresistant) and their sensitive MCF7/LCC1
parent cells. Candidate genes from other studies are also
incorporated into the network. Arrows represent those genes with
altered expression, and the consequences of these changes are
represented in the context of an antiestrogen-resistant phenotype.
For example, the low levels of IRF-1 in MCF7/LCC9 cells are
unable to induce EGFR, which remains low in these cells.
Redundancy is evident; for example, the upregulation of NFκB
and ras may compensate for low EGFR expression because they
signal downstream of the EGFR's kinase activity. Signaling
through this network component is expected to be different
between sensitive and resistant cells and likely also different among
some populations with the same phenotype. For example, not all
resistant cells need to modify gene expression in the same pattern as
apparently adopted by MCF7/LCC9 cells. Since ER-mediated
effects are critical in antiestrogen-induced signals in sensitive cells,
these cells may signal through the network component primarily
comprising ER-regulated genes. While the interactions in this figure
are consistent with published data, the network as represented is
not intended to be complete and the regulation of some genes may
be more complex than alluded to here. As we further evaluate
signaling in these cells, we may identify additional components of
this network. {} = receptor-ligand complex; ↑ = expression is
increased; ↓ = expression is reduced; other arrows show direction
of signal transduction; ⊥ = inhibition of indicated gene/function;
—• = inability to induce substantially next signal or influence next
event due to low/reduced expression/activity.
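
The consistency among these interactions can be illustrated with a toy signed graph. The sketch below is an assumption-laden simplification, not the authors' analysis: it encodes only four interactions stated in the text (TNFa/TNF-R1 induces IRF-1, IRF-1 induces EGF-R, EGF-R induces EGR-1, EGR-1 inhibits NFkB), treats each as a simple sign, and asks whether reduced TNFa/TNF-R1 signaling alone would predict the directions of change reported for MCF7/LCC9 cells. All names are illustrative.

```python
from collections import deque

# (source, target, sign): +1 = induces, -1 = inhibits; taken from the text.
EDGES = [
    ("TNFa/TNF-R1", "IRF-1", +1),
    ("IRF-1", "EGFR", +1),
    ("EGFR", "EGR-1", +1),
    ("EGR-1", "NFkB", -1),
]

# Reported change in MCF7/LCC9 relative to MCF7/LCC1: -1 = lower, +1 = higher.
OBSERVED = {"TNFa/TNF-R1": -1, "IRF-1": -1, "EGFR": -1, "EGR-1": -1, "NFkB": +1}


def propagate(start, start_sign, edges):
    """Breadth-first propagation of a change in sign along signed edges."""
    predicted = {start: start_sign}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for source, target, sign in edges:
            if source == node and target not in predicted:
                # An inducer passes the change on; an inhibitor reverses it.
                predicted[target] = predicted[node] * sign
                queue.append(target)
    return predicted


predicted = propagate("TNFa/TNF-R1", -1, EDGES)
consistent = all(predicted[gene] == OBSERVED[gene] for gene in predicted)
print(predicted)
print("Predictions match reported changes:", consistent)
```

A sign-only model ignores magnitude, timing, and redundancy, so agreement here shows only that the reported changes do not contradict one another.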

Interferons  (IFNs)  and  TNF  act  synergistically  to 
induce  gene  expression,  an  effect  that  appears  driven  by 
protein-protein interactions between IRF-1 and NFκB
(Drew et al, 1995; Neish et al, 1995). IRF-1 can induce
degradation of IκBα in some cells (Kirchoff et al, 1999).
IRF-1:NFκB heterodimers affect expression of the
ATF-2/jun (Escalante et al, 1998), RANTES (Lee
et al, 2000), VCAM-1 (Neish et al, 1995), IL-6
(Sanceau et al, 1995), and MHC class I genes (Drew
et al, 1995). Altered AP-1 expression (includes jun) is
implicated in the TAM-stimulated antiestrogen resistance
phenotype (Schiff et al, 2000), RANTES expression
correlates with a poor prognosis (Luboshits et al,
1999),  VCAM-1  is  involved  in  angiogenesis  and 
metastasis  in  breast  tumors  (Byrne  et  al,  2000),  and 
autocrine  production  of  IL-6  is  associated  with  drug 
resistance  in  breast  cancer  cells  (Conze  et  al,  2001). 

Unlike  IRF-1,  NPM  expression  is  increased  in 
MCF7/LCC9  cells  compared  with  MCF7/LCC1  cells. 
NPM  can  function  as  an  oncogene,  its  overexpression 
fully  transforming  NIH  3T3  cells  in  an  assay  for 
oncogenic  potential  (Kondo  et  al,  1997).  Levels  of 
autoantibodies  to  NPM  increase  in  patients  6  months 
prior  to  recurrence.  Consistent  with  an  antiestrogenic 
regulation  of  NPM,  the  levels  of  NPM  autoantibodies 
are  lower  in  breast  cancer  patients  who  received  TAM 
(Brankin  et  al,  1998).  Concurrent  upregulation  of  NPM 
and  downregulation  of  IRF-1  suggest  a  novel  signaling 
pathway  in  antiestrogen  resistance.  Both  are  estrogen- 
regulated  genes  in  MCF-7  cells,  IRF-1  expression  being 
suppressed  and  that  of  NPM  being  induced  (Skaar  et  al, 
1998,  2000).  Through  its  direct  binding  to  IRF-1,  NPM 
inhibits  the  transcription  regulatory  activities  of  IRF-1 
(Kondo et al, 1997). Overexpression of NPM may
eliminate the remaining IRF-1 activity, blocking its
ability to initiate an apoptotic caspase cascade, and/or
induce p21WAF1/CIP1 (Coccia et al, 2000) and cooperate
with p53 in signaling to growth arrest and apoptosis
(Tanaka et al, 1994a, 1996).

XBP-1  acts  through  its  ability  to  regulate  genes 
containing  CRE  in  their  promoters  (Clauss  et  al, 
1996). A cAMP-dependent pathway that inhibits IRF-1
transactivation has been described (Delgado et al,
1999);  XBP-1  activation  of  this  pathway  could  suppress 
further  the  already  low  IRF-1  activity  in  some 
antiestrogen-resistant  cells. 

Increased expression of the n-ras-related gene may also be
important because it implies an upregulation of ras-induced
signaling in resistant cells. Such increased signaling may partly
abrogate  the  need  for  growth  factor-induced  signaling 
through  autocrine,  paracrine,  or  intracrine  stimulation 
(Clarke  et  al,  2001b)  because  increased  ras  activation 
is  downstream  of  several  growth  factor  receptors 
implicated  in  breast  cancer  (Santen  et  al,  2002). 
For  example,  cells  may  be  capable  of  surviving  when 
EGFR  levels  are  reduced  (Table  5)  because  loss  of 
EGFR  signaling  is  compensated  by  a  downstream 
upregulation  of  ras-mediated  signaling.  Low  IRF-1 
expression  may  also  contribute  to  the  effects  of  ras 
signaling  because  IRF-1  induces  lysyl  oxidase  (Sers  et  al, 
2002),  which  is  implicated  in  reversing  ras-induced 
malignant transformation (Contente et al, 1999; Nozawa
et al, 1999).

Some  of  the  genes  we  found  have  been  implicated  in 
antiestrogen  resistance  in  other  studies,  most  notable 
being  EGF-R  (Nicholson  et  al,  2001)  and  its  family 
member  c-erbB2  (Kurokawa  et  al,  2000;  Welch  and 
Clarke,  2002;  Konecny  et  al,  2003).  AKT  (Perez- 
Tenorio  and  Stal,  2002),  c-myc  (Carroll  et  al,  2002), 
cyclin  D1  (Varma  and  Conrad,  2002),  p53, 

(Fattman  et  al,  1998),  and  AP-1  (Schiff  et  al,  2000)  may 
also  contribute  to  antiestrogen  responsiveness.  We  have 
incorporated  some  of  this  knowledge  into  the  network 
in  Figure  3,  particularly  where  these  genes  may  interact 
with  those  identified  in  our  models.  Several  genes  are 
thought  to  be  downstream  of  signaling  from  growth 
factor  receptors  implicated  in  either  phosphorylating/ 
activating  ER  and/or  inducing  mitogenesis  and  affecting 
antiestrogen  responsiveness  (Chan  et  al,  2001;  Varma 
and  Conrad,  2002).  For  example,  the  type  I  insulin-like 
growth  factor  receptor  and  c-erbB2  can  activate  AKT, 
which is often upstream of NFκB (Martin et al, 2000).
Several  growth  factors  activate  MAPK  signaling  to 
mitogenesis  and  signal  through  activation  of  ER.  For 
simplicity,  we  have  not  shown  all  of  these  possible 
interactions  in  Figure  3. 


Functional  studies 

We  acknowledge  that  the  gene  network  component  in 
Figure  3  is  somewhat  speculative.  Furthermore,  it  is 
unlikely  to  be  regulated  in  the  same  way  in  TAM- 
stimulated  models  that  perceive  TAM  as  an  estrogen. 
For  example,  in  TAM-stimulated  models,  key  network 
components  could  be  perturbed  in  the  same  manner  as 
expected with estradiol treatment.

One approach to assessing the likely validity of
selected genes in our network component is to explore
their functional activities and abilities to affect antiestrogen
responsiveness in experimental models. We
have begun several studies to further assess the likely
functional relevance of our observations and support
the gene network component in Figure 3. Transcriptional
activation of XBP-1 and NFκB was studied using
established promoter-reporter assays (CRE promoter-reporter
assay for XBP-1). As predicted in the transcriptome
analyses, increased basal transcription of both
promoters was observed. Further studies showed that
the ability of ICI 182,780 to inhibit NFκB activation is
lost in the resistant cells. Preliminary data from our
laboratory imply that the ability of antiestrogens to
induce IRF-1 is also lost in resistant cells (Bouker et al.,
2002). Consistent with our earlier hypotheses (Clarke
and Lippman, 1992), these data show significant
changes in the endocrine regulation of some ER-regulated
genes. We found no evidence for endocrine
regulation of CRE activation in either responsive or
resistant cells. However, resistant cells exhibit a significant
fourfold increase in CRE activation, reflecting
the fourfold increase in XBP-1 expression predicted from the
SAGE study. These observations suggest at least three
general resistance mechanisms: an overexpression and
loss of endocrine regulation of some genes that are ER-regulated
in responsive cells, a downregulation and loss
of endocrine regulation of some genes that are ER-regulated
in responsive cells, and an upregulation of
some endocrine unresponsive genes.

To  study  functional  relevance  further,  the  sensitivity 
of our variants to inhibition of NFκB activation by
parthenolide was explored. Parthenolide, which is
currently in early clinical trials, binds NFκB in a highly
stereospecific manner (Garcia-Pineres et al., 2001) and
inhibits the IκB kinase repressor of NFκB (Hehner et al.,
1999; Patel et al., 2000). We would expect that, if NFκB
is providing a survival function, MCF7/LCC9 cells
might be more dependent upon this activity. Indeed,
MCF7/LCC9 cells are significantly more sensitive to
growth inhibition by parthenolide than their MCF7/LCC1
parental cells (Gu et al., 2002). Thus, some cells
may  survive  antiestrogen  exposure  by  upregulating 
estrogen-regulated  survival  factor(s)  concurrent  with 
the loss of their ER-mediated regulation. While we first
need to confirm and extend these observations, parthenolide
may prove useful in combination with Faslodex
or other antiestrogens to increase responsiveness
and/or delay the appearance of resistant disease.
Functional  studies  into  the  activities  of  the  other  genes 
in  this  network  and  investigations  into  their  power  to 
better  predict  antiestrogen  responsiveness  in  patients  are 
in  progress. 

Conclusions  and  future  prospects 

Acquired antiestrogen resistance likely comprises both
true antiestrogen unresponsiveness (the major phenotype)
and antiestrogen-stimulated growth (probably a
minor phenotype). Several resistance mechanisms exist
and, with the exception of loss of ER expression, these
mechanisms may not be driven by a single gene or single
signaling pathway. Consequently, we continue to develop
the concept that an integrated gene network exists
that  allows  cells  a  significant  degree  of  plasticity  in  how 
they  signal  through  this  network  (Clarke  and  Brunner, 
1995,  1996;  Clarke  et  al,  2001c).  More  recently,  we  have 
begun  to  identify  candidate  genes  in  one  component  of 
this  network  and  to  explore  their  likely  functional 
relevance  in  experimental  models  and  ability  to  predict 
patient  outcome.  As  we  and  others  explore  the 
transcriptomes  and  proteomes  of  experimental  models 
and  patient  samples,  additional  components  of  this 
network may become apparent. Ultimately, understanding
how breast cancer cells coordinate a response
to  antiestrogens,  and  overcome  the  growth  inhibitory 
nature  of  the  resulting  signaling,  may  lead  to  better 
treatments  and  more  powerful  predictors  of  clinical 
response. 

Some  dietary  components  can  modify  the  ability  of 
TAM to inhibit the growth of ER+ and perhaps also
ER-  breast  cancer  cells.  These  dietary  components 
might  be  those  that  alone  are  believed  to  affect 
recurrence  of  breast  cancer.  However,  when  consumed 
in  combination  with  TAM,  various  dietary  components 
could  either  potentiate  or  inhibit  TAM’s  actions. 
Examples  of  unexpected  findings  are  the  studies  of  Ju 
et  al  (2002)  and  Depypere  et  al  (2000),  who  showed 
that  genistein  or  tangeretin  prevents  TAM  from 
inhibiting  growth  of  malignant  breast  cells.  Currently, 
only  a  few  published  studies  have  examined  the  impact 
of  nutrition  on  TAM’s  therapeutic  effects,  and  it  is  likely 
that  other  dietary  factors  can  modify  TAM’s  ability  to 
inhibit  breast  cancer  growth. 

The  clinical  use  of  antiestrogens,  and  TAM  in 
particular,  may  change  in  the  future.  Data  from  some 
recent  studies  suggest  that  the  current  generation  of 
aromatase  inhibitors  may  be  more  effective  than 
antiestrogens  as  first-line  endocrine  treatment  for 
ER+ metastatic breast cancer and as adjuvant therapy
for ER+ breast primaries (Buzdar and Howell, 2001;
Ellis  et  al,  2001).  Nonetheless,  the  American  Society  of 
Clinical  Oncology’s  Technology  Assessment  Working 
Group  continues  to  recommend  5  years  of  adjuvant 
TAM as the standard therapy for women with ER+
breast  cancer  (Winer  et  al,  2002).  In  terms  of 
chemoprevention,  the  recommendations  include  the 
use  of  TAM  vs  participation  in  a  clinical  trial  that 
involves  the  administration  of  raloxifene,  any  aromatase 
inhibitor,  or  any  retinoid  only  within  the  context  of 
chemoprevention  (Chlebowski  et  al,  2002). 
High expression of erbB2 is associated with more aggressive cancers

The mechanism by which erbB2 promotes drug resistance is unknown


Introduction 

Identification of molecular markers that could
serve as accurate predictors of response to specific
cytotoxic chemotherapies would be useful in targeting
such therapies at individual patients. One
potential molecular marker is erbB2, a transmembrane
receptor tyrosine kinase and member of the
EGFR superfamily, which has been implicated in
the generation and/or progression of a number of
different carcinomas, notably tumours of the
breast.1-4 Increased expression of erbB2 is frequently
associated with more aggressive cancers
and has been implicated in conferring resistance
to some drugs.

Consequently, several erbB2-specific therapeutic
agents have been developed to target tumours that
specifically overexpress this protein. Herceptin™
(trastuzumab), an anti-erbB2 monoclonal antibody
and the first erbB2 approach to be approved
for use, has shown activity in some breast cancers
and may improve response to specific cytotoxic
drugs.5,6 The activity of Herceptin™ has thus
far been attributed to blocking the mitogenic
growth signalling driven by erbB2 and/or eliciting
an antitumour immune response.

The mechanisms by which erbB2 promotes drug
resistance have not been established. Nor has the
validity of using erbB2 expression to predict
potential drug resistance been proven.

This review summarises and discusses the role of
erbB2 as a molecular marker for therapeutic intervention
and drug resistance in cancer.

Background 

Increased expression of erbB2 has been observed
in many solid tumours (table 1), including
breast,8,9 prostate, ovarian, colorectal,13,14
endometrial,15 and non-small-cell lung cancers.16-18
In most cases, increased expression of
erbB2 is associated with poor prognosis and is
correlated with decreased relapse-free and overall
survival.10-15,19,20 An exception may be colorectal
tumours, which do not consistently demonstrate
a clear relationship between erbB2 expression and
prognosis.21,22

Table 1. Increased levels of erbB2 expression in selected solid tumours
[Table body not recoverable from the scanned source. Surviving row labels: colorectal, endometrial, non-small-cell lung, ovarian, prostate. Surviving column heading: observed increased erbB2 expression. One surviving value, row unknown: 19-65% (varies with stage).]

Despite the frequent association of erbB2 expression
with more aggressive tumours, the precise
mechanisms involving tumour progression are
not clearly defined. However, several possibilities
have been proposed that may either work in concert
or function independently in a tissue-specific
manner. For example, erbB2 up-regulation can
enhance the activation of Akt, a serine-threonine
kinase involved in antiapoptotic signalling.23 In
breast tumour biopsies, increased erbB2 expression
correlates with enhanced Akt-mediated activation
of NF-κB, a transcription factor known to
increase the production of cell survival proteins.24-26
erbB2 levels have also been correlated with both
enhanced Akt expression and Akt-mediated resistance
to apoptosis induced by either UV irradiation
or hypoxia in breast cancer cells.24 Similar findings
have also been demonstrated in prostate,
ovarian28 and non-small-cell lung cancer cell
lines.29

erbB2  and  drug  resistance 

Several signalling pathways affecting apoptosis
may be influenced by erbB2 activity. For example,
elevated erbB2 expression correlates with greater
resistance to tumour necrosis factor (TNF)-α-induced
apoptosis in several in vitro models,
including breast and ovarian cancer cells,24
NIH 3T3 cells30 and cervical carcinoma cells.

erbB2-mediated resistance appears to be dependent
upon Akt activation.28 Conversely, inhibition
of erbB2 signalling enhances TNF-α-mediated
apoptosis in breast, ovarian28 and lung cancer
cells.32

Other molecular mechanisms that may contribute
to erbB2-mediated drug resistance include
disruption  of  cell  cycle  checkpoint  proteins, 
increased  signalling  through  growth-promoting 
pathways,  and  modulation  of  oestrogen  receptor 
(ER)  function.  The  likely  role(s)  of  these  effects  in 
drug  resistance  are  described  below,  beginning 
with  the  possible  effects  of  erbB2  expression  on 
response  to  anti-oestrogens. 

Although erbB2 expression has been widely
implicated in affecting response to various antineoplastic
drugs, the typical association of erbB2
overexpression with poor clinical outcome complicates
the assessment of its role in drug resistance
in some studies. Tumours with a poor
prognosis may have a poor clinical outcome irrespective
of treatment, reflecting their biological
progression rather than any specific drug resistance.
When examined in breast cancer models,
such as normal mammary epithelial cells or
human breast cancer cell lines coaxed to overexpress
erbB2 by transfection,33 erbB2 overexpression
does not confer drug resistance. However,
co-expression of erbB2 with other EGFR family
members can produce resistance to several
chemotherapeutic agents commonly used to treat
breast cancer.34 These observations suggest that
cellular context (the pattern of other genes/proteins
expressed within a cell35) can significantly
affect erbB2 signalling and drug responsiveness.

erbB2-mediated  resistance  to  tamoxifen 

Resistance to the triphenylethylene anti-oestrogen
tamoxifen has been correlated with erbB2 expression
in several in vitro studies,36-39 but the mechanisms
are unclear.35 Protein-protein interactions
between erbB2 and ER have been described in cell
membranes and may protect breast cancer cells
from tamoxifen-induced apoptosis by preventing
tamoxifen-ER interactions.36 Overexpression of
erbB2 in MCF-7 breast cancer cells prevents
tamoxifen-induced apoptosis, apparently by up-regulating
the antiapoptotic bcl-2 and bcl-xL proteins.32
erbB2 signalling via MAP kinase activation
has also been proposed as a mechanism for
tamoxifen resistance in breast cancer cells.38

The ability of erbB2 to induce Akt-mediated
NF-κB signalling to promote cell survival implicates
this pathway as another mechanism for anti-oestrogen
resistance. Recently, indirect evidence
in support of this hypothesis was obtained using
gene-expression microarray analysis of anti-oestrogen-resistant
cells, and increased NF-κB
expression in cells surviving treatment with the
anti-oestrogen Faslodex™ (fulvestrant) was
reported.40 Furthermore, anti-oestrogen-resistant
cells have up-regulated NF-κB transcriptional activation,
marked by a lack of anti-oestrogen regulation
of this activation, and are more sensitive to
inhibition of NF-κB activity by the small-molecule
inhibitor parthenolide.40 Since NF-κB is
downstream of both erbB2 and Akt, some
tumours may become resistant without increased
activity of either upstream component (fig. 1).
Thus, cellular context may be a key determinant
in how erbB2 signals in anti-oestrogen-resistant
breast cancers. It remains to be seen whether
including measurements of erbB2, Akt and NF-κB
will improve the ability to predict anti-oestrogen
responsiveness in breast cancer.

Several clinical studies have also shown that
tumours overexpressing erbB2 exhibit a decreased
response to tamoxifen when compared to tumours
without erbB2 overexpression.41-43 However, some
studies are difficult to interpret because erbB2
overexpression is often associated with ER negativity
and ER-negative tumours rarely respond to
anti-oestrogens.35
Figure 1. erbB2 signalling pathways and their implication in conferring resistance to antineoplastic drugs.
The combination of tamoxifen and Herceptin™ is
more effective in treating erbB2-positive tumours
than either agent alone. While this is pharmacological
antagonism, blocking erbB2 activity during
the administration of tamoxifen could still
provide clinical benefit.

erbB2 and resistance to taxoids

Data from several in vitro studies indicate that
erbB2 may contribute to paclitaxel resistance in
breast and head and neck cancer cells. The
opposite effect was reported in one study in ovarian
cancer cells, where increased erbB2 expression
correlated with increased sensitivity to paclitaxel.45
Whether this latter result truly reflects a
different relationship between erbB2 expression
and paclitaxel resistance in ovarian tumours, in
comparison to other solid tumours, requires further
study.

Two mechanisms of paclitaxel resistance have
been attributed to erbB2 signalling, other than
simply blocking direct signalling to apoptosis.
First, erbB2 disrupts paclitaxel-induced cell cycle
arrest at the G2/M checkpoint, an effect that normally
leads to apoptosis through the involvement
of the serine-threonine kinase p34cdc2.50 erbB2
signalling has been associated with increased
expression of p21Cip1, which inhibits the function
of p34cdc2 and allows cells to bypass the G2/M
checkpoint and avoid paclitaxel-induced apoptosis.51,52
Second, paclitaxel disrupts the cell cycle by
interfering with the microtubule functions associated
with mitosis. NIH-3T3 cells genetically engineered
to overexpress erbB2 show alterations in
β-tubulin isotype expression patterns associated
with paclitaxel resistance.

Currently, combinations of taxoids with anti-erbB2
therapeutics are being tested in the clinical
setting. Data available from phase II trials in
breast cancer patients, in whom the use of
Herceptin™ is appropriate, indicate that a combination
of Herceptin™ with either paclitaxel or
docetaxel is both effective and tolerable.

While these early studies suggest that combinations
of Herceptin™ and taxoids are potentially
more effective than other drug combinations, further
study is needed before firm conclusions can
be drawn.

Anthracycline responsiveness and
erbB2 expression

Like the taxoids, the anthracyclines, particularly
Adriamycin™ (doxorubicin), are among the most
effective single agents for breast cancer and are
commonly used in combination first-line therapy.
In contrast to the taxoids, high levels of erbB2
expression in breast tumours generally correlate
with increased response to anthracycline-based
regimens,59-62 although exceptions have been
noted.63,64

Anthracyclines inhibit topoisomerase II activity
and the erbB2 and topoisomerase IIα genes,
which are located near each other on chromosome
17, are often co-expressed. A functional
interaction between erbB2 and topoisomerase IIα
has been reported, where erbB2 increases topoisomerase
II activity.53 Thus, some tumours that
overexpress erbB2, independent of gene amplification,
may also express sufficient levels of topoisomerase
II to exhibit increased sensitivity to
anthracyclines. Some investigators have suggested
that overexpression of erbB2 might serve as a
biomarker for predicting anthracycline responsiveness
in patients, but this requires further
study.65

erbB2 and EGFR

The  erbB2  signalling  network  is  an  attractive 
molecular  target  in  breast  cancer,  especially  in  ER- 
negative  disease.  Although  inhibition  of  erbB2 
results  in  tumour  regression  in  a  cohort  of  patients 
with metastatic disease, it is less clear whether targeting
other receptors in this signalling network
would  be  of  therapeutic  benefit. 

Aberrant EGFR and erbB2 signalling has been
causally associated with enhanced breast cancer
cell proliferation and shorter survival in patients
with mammary carcinomas. Also, studies
with breast cancer cell lines and human tumours
have demonstrated constitutive phosphorylation
of erbB2.68,69 The reasons for this constitutive
activation are not clear but one possibility
includes co-expression of ligand-activated EGFR
resulting in transactivation of the erbB2 tyrosine
kinase. Indeed, in cells that co-express erbB2,
ligand-activated EGFR preferentially recruits
erbB2 into a heterodimeric complex that exhibits
an increased rate of recycling, stability, and signalling
potency compared to EGFR homodimers.

The recent work by Shou et al. highlights the
important interaction between EGFR and erbB2.72
The authors examined the cross-talk between the
ER and the EGFR/erbB2-receptor family by examining
the ER-positive (MCF-7) and tamoxifen-resistant
erbB2-overexpressing (HER2-18) breast
cancer cell lines. In both cell lines, ZD 1839
('Iressa') inhibited ER, EGFR and erbB2 phosphorylation
induced by epidermal growth factor
and heregulin, but not that by oestrogen.
Interestingly, in the tamoxifen-resistant HER2-18
cells, ZD 1839 completely inhibited both oestrogen-
and tamoxifen-induced phosphorylation and
activation of erbB2.

Following  on  from  this  observation,  long-term 
studies  to  investigate  whether  ZD  1839  treatment 
can  delay  or  prevent  development  of  resistance  to 
various  endocrine  therapies  are  in  progress. 

Another intriguing finding from studying breast
cancer cells is that cells with high expression levels of
erbB2, even in the presence of a low number of
EGFR, are exquisitely sensitive to ZD 1839. Using
a panel of breast cancer cell lines representative of
the entire spectrum of EGFR and erbB2 alterations
found in breast cancer patients, Campiglio
et al.73 showed, by measuring growth inhibition of
these cell lines, that sensitivity to ZD 1839
did not depend on the level of EGFR expression.

If receptor cooperativity is in fact operational in
breast cancers, interruption of EGFR function
with EGFR-specific tyrosine kinase inhibitors,
such as ZD 1839, may disrupt EGFR-erbB2 cross-talk
and result in erbB2 inactivation as well. This
inactivation of erbB2 through the inhibition of
EGFR may also increase the antitumour effect of
Herceptin™.

Summary 

The association between erbB2 expression and sensitivity
of the tumour/cancer cells to chemotherapy
has been widely studied, particularly in breast cancer.
Currently, data suggest that high levels of
erbB2 confer some degree of resistance to taxoids
but may sensitise cells to anthracyclines. Adjuvant
chemotherapy regimens using other cytotoxic drugs
(e.g. methotrexate, 5-fluorouracil, cisplatin) have
also demonstrated reduced efficacy in high erbB2
expressing tumours in comparison to those with low
erbB2 expression. However, others have found
no link between erbB2 expression and response to
adjuvant chemotherapy in some breast tumours.
More recently, investigators have begun to study
combinations of Herceptin™ with other chemotherapeutic
agents, such as gemcitabine in breast
cancer and estramustine in prostate cancer.
Several such combinations show promise and may
become more widely accepted in the near future.

The association between erbB2 expression and poor
prognosis is well established, making erbB2 a useful
prognostic marker in some cancers. Evidence
clearly suggests that overexpression of erbB2 may
also be a useful predictor of responsiveness to specific
drugs in some tumours. However, the complexity
of cell signalling and the importance of cellular
context may dictate that solely measuring erbB2
expression may not be sufficiently discriminative. It
may be necessary to identify a series of additional
genes which, when information on their expression
patterns is combined, will produce a sufficiently
powerful predictor of drug-specific responsiveness.

The mechanisms by which erbB2 expression affects
drug responsiveness may be more complex than currently
appreciated and may require a better understanding
of cellular context and the factors that
affect erbB2 signalling to proliferation/cell survival.
However, gene-expression microarray and
proteomic technologies have the power to better
define cellular context and identify genes/proteins
that can modify erbB2 signalling. Some of these
genes may even provide new targets for drug development.
Since erbB2 sensitises cells to some drugs
while conferring resistance to others, measuring
erbB2 expression and those genes that affect its
downstream signalling could enable the targeting of
specific therapies to individual tumours.
SUMMARY

We  propose  a  block  principal  component  analysis  method  for  extracting  information  from  a  database 
with  a  large  number  of  variables  and  a  relatively  small  number  of  subjects,  such  as  a  microarray  gene 
expression  database.  This  new  procedure  has  the  advantage  of  computational  simplicity,  and  theory  and 
numerical  results  demonstrate  it  to  be  as  efficient  as  the  ordinary  principal  component  analysis  when 
used  for  dimension  reduction,  variable  selection  and  data  visualization  and  classification.  The  method  is 
illustrated  with  the  well-known  National  Cancer  Institute  database  of  60  human  cancer  cell  lines  data 
(NCI60)  of  gene  microarray  expressions,  in  the  context  of  classification  of  cancer  cell  lines.  Copyright 
©  2002  John  Wiley  &  Sons,  Ltd. 

KEY  WORDS:  principal  component  analysis;  grouping  of  variables;  similarity;  gene  expression; 
microarray  data  analysis 


1.  INTRODUCTION 

Principal component analysis is one of the most common techniques of exploratory multivariate
data analysis. It is a method of transforming a set of p correlated variables $x = (x_1, \ldots, x_p)'$
to a set of p uncorrelated variables $y = (y_1, \ldots, y_p)'$ that are linear functions
of the x's, referred to as p principal components of x, such that the variances of the
y's are in descending order with respect to the variation among the x's. Usually the first
several components explain most of the variation among the x's. In addition to many other
applications, principal component analysis has been shown to be a useful tool in reducing
data dimension and extracting information, in seeking important regressors in regression analysis,
and in effectively visualizing and clustering subjects, when measurements on a large
number of variables are collected from each subject. The book by Jolliffe [1] provides excellent
reading on this topic, although other textbooks on multivariate data analysis do also (for
example, references [2] and [3]). Recently, principal component analysis has found application
in the analysis of microarray gene expressions [4], a growing technology in human genome
studies [5,6].

When  dealing  with  an  extremely  large  number  of  variables  (for  example,  500  or  more), 
deriving  principal  components  can  be  computationally  intensive,  since  it  involves  finding  the 
eigenvectors (and eigenvalues) of a matrix with large dimensions. Moreover, a linear combination
of such a large number of variables becomes less meaningful to the investigators
since  the  high  dimensionality  makes  it  hard  to  extract  useful  information  and  to  interpret 
the  combination.  In  one  microarray  technology,  cDNA  clone  inserts  are  printed  onto  a  glass 
slide  and  then  hybridized  to  two  differentially  fluorescently  labelled  probes.  The  final  gene 
expression  profile  contains  fluorescent  intensities  and  ratio  information  of  many  hundreds  or 
thousands  of  genes.  If  one  intends  to  apply  principal  component  analysis  directly  to  extract 
gene  expression  information  for  these  genes  from  a  certain  group  of  subjects,  then  one  has 
to  deal  with  a  matrix  with  huge  dimensions. 

In  dealing  with  such  high  dimensional  data,  we  propose  to  perform  the  principal  component 
analysis in a ‘stratified’ way. We first group the original variables into several ‘blocks’ of
variables,  in  the  sense  that  each  block  contains  variables  (genes  in  the  microarray  experiments) 
that  are  similar;  variables  from  the  same  block  are  more  correlated  than  variables  from  different 
blocks.  We  then  perform  principal  component  analysis  within  each  block  and  obtain  a  small 
number  of  variance-dominating  principal  components.  Combining  these  principal  components 
obtained  from  each  block  forms  a  new  database  from  which  we  can  then  extract  information 
by performing a new principal component analysis. We term this procedure ‘block principal
component analysis’. Dominating principal components obtained from the final stage can then
be  used  in  various  data  exploratory  analyses  such  as  clustering  and  visualization. 
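The two-stage procedure described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch under stated assumptions: the function names `pca` and `block_pca`, the number of components kept per block, and the way the blocks are supplied (as ready-made index arrays rather than derived from a similarity measure) are all hypothetical choices, not the authors' implementation.

```python
import numpy as np

def pca(data, n_components):
    """Leading principal components of data (rows = subjects, columns = variables)."""
    centered = data - data.mean(axis=0)
    cov = np.atleast_2d(np.cov(centered, rowvar=False))
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    loadings = eigvecs[:, order]
    return centered @ loadings, loadings, eigvals[order]

def block_pca(data, blocks, per_block, n_final):
    """Stage 1: PCA within each block. Stage 2: PCA on the combined block scores."""
    stage1 = [pca(data[:, idx], per_block)[0] for idx in blocks]
    combined = np.hstack(stage1)                      # the 'new database' of block components
    scores, _, variances = pca(combined, n_final)
    return scores, variances

# Hypothetical data: 60 subjects, 12 variables, three pre-defined blocks of 4 variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(60, 12))
blocks = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]
scores, variances = block_pca(data, blocks, per_block=2, n_final=3)
```

Each eigenproblem here involves at most a 4 x 4 or 6 x 6 matrix instead of the full 12 x 12 covariance matrix, which is the source of the computational saving when the number of variables is large.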

The  proposed  ‘block  principal  component  analysis’  method  also  enables  us  to  reduce  the 
number  of  variables  effectively.  Within  each  block,  when  principal  component  analysis  is 
conducted  and  dominating  linear  combinations  of  variables  are  examined,  only  those  variables 
that  have  relatively  large  coefficients  are  retained.  We  will  examine  this  variable  selection 
procedure  in  detail  using  the  gene  microarray  example. 
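As a rough illustration of this selection rule, the sketch below keeps the variables of one block whose coefficient in a leading component is large in absolute value. The function name `select_variables` and the cut-off of 0.3 are assumptions made for the example; the text does not fix a particular threshold.

```python
import numpy as np

def select_variables(block_data, n_components=1, threshold=0.3):
    """Indices of variables with a large coefficient in any leading component."""
    centered = block_data - block_data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    leading = eigvecs[:, np.argsort(eigvals)[::-1][:n_components]]
    keep = np.any(np.abs(leading) >= threshold, axis=1)  # sign of a loading is arbitrary
    return np.flatnonzero(keep)

# Hypothetical block: variables 0 and 1 share a strong signal, 2 and 3 are weak noise.
rng = np.random.default_rng(1)
signal = rng.normal(size=200)
noise = rng.normal(size=(200, 4))
block = np.column_stack([
    signal + 0.1 * noise[:, 0],
    signal + 0.1 * noise[:, 1],
    0.1 * noise[:, 2],
    0.1 * noise[:, 3],
])
selected = select_variables(block, n_components=1, threshold=0.3)
```

In this toy block the first component is dominated by the two signal-carrying variables, so only those two are retained.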

After  a  brief  review  of  the  mathematical  derivation  of  principal  components  and  their 
applications  in  Section  2,  we  introduce  in  Section  3  the  method  of  ‘block  principal  component 
analysis’.  In  Section  4,  we  investigate  the  efficiency  of  block  principal  components  in  the 
reduction  of  data  dimension  with  respect  to  the  amount  of  variance  explained.  It  is  shown  that 
the  proposed  procedure  can  be  as  efficient  as  the  ordinary  principal  component  analysis.  We 
then  discuss  the  selection  of  informative  variables  using  block  principal  component  analysis. 
In  Section  5  we  apply  the  method  to  the  problem  of  classification  of  microarray  data  from 
the  well-known  National  Cancer  Institute  database  of  60  human  cancer  cell  lines  (NCI60), 
each of which has gene microarray expression of more than 1000 genes [7]. Some discussion
is  given  in  Section  6. 


2.  PRINCIPAL  COMPONENTS 

We  start  with  a  brief  mathematical  derivation  of  principal  components.  More  details  can  be 
found  in  references  [1]  or  [2]  and  [3].  Throughout,  vectors  are  viewed  as  column  vectors,  and 
A'  is  the  transpose  of  a  matrix  A. 
Consider a p-variate random vector X with mean vector $\mu$ and positive definite covariance
matrix $\Sigma$. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p\ (>0)$ be the eigenvalues of $\Sigma$ and let $W = (w_1, \ldots, w_p)$ be a $p \times p$
orthogonal matrix such that

$$W'\Sigma W = \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p) \qquad (1)$$

so that $w_i$ is an eigenvector of $\Sigma$ corresponding to the eigenvalue $\lambda_i$. Now put $U = W'X = (U_1, \ldots, U_p)'$;
then $\mathrm{cov}(U) = \Lambda$, so that $U_1, \ldots, U_p$ are all uncorrelated, and $\mathrm{var}(U_i) = \lambda_i$,
$i = 1, \ldots, p$. The linear components $U_1, \ldots, U_p$ are called principal components of X. The first
principal component is $U_1 = w_1'X$ and its variance is $\lambda_1$; the second principal component is
$U_2 = w_2'X$ with variance $\lambda_2$; and so on. These p principal components have the following key
property. The first principal component $U_1$ is the normalized (unit length) linear combination
of the components of X with the largest variance, and its maximum variance is $\lambda_1$; then out
of all normalized linear combinations of the components of X which are uncorrelated with $U_1$,
the second principal component $U_2$ has maximum variance $\lambda_2$. In general, the kth principal
component $U_k$ has maximum variance $\lambda_k$ among all normalized linear combinations of the
components of X which are uncorrelated with $U_1, \ldots, U_{k-1}$.
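The diagonalization in equation (1) is easy to check numerically. The following sketch builds an arbitrary positive definite matrix as a stand-in for the covariance matrix (the matrix itself and the variable names are assumptions for illustration only) and verifies that the eigenvector matrix is orthogonal and diagonalizes it.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4))
sigma = a @ a.T + 4 * np.eye(4)        # an arbitrary positive definite 'covariance' matrix

eigvals, eigvecs = np.linalg.eigh(sigma)
order = np.argsort(eigvals)[::-1]      # sort so that lambda_1 >= ... >= lambda_p
lam = eigvals[order]
W = eigvecs[:, order]                  # columns are the eigenvectors w_1, ..., w_p

# cov(U) for U = W'X is W' Sigma W, which should equal diag(lambda_1, ..., lambda_p).
cov_u = W.T @ sigma @ W
```

Because `cov_u` is diagonal, the components of U are uncorrelated, and its diagonal entries are the variances of the successive principal components.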

Very  often  these  principal  components  are  referred  to  as  population  principal  components. 
In practice $\Sigma$ is not known and has to be estimated from the sample, yielding the sample
principal  components.  We  do  not  distinguish  these  two  definitions  here. 

Once the p principal components are derived, then we can conduct various statistical analyses
using only the first q (< p) principal components which account for most of the variance
of X. For example, we can plot the first two (three) principal components in a two- (three-)
dimensional space to seek interesting patterns among the data, or perform clustering analysis
on subjects in order to search for clusters among the data. We can also use these leading
principal components as regressors in a regression analysis to find prognostic factors for clinical
outcomes (for example, drug response or resistance). See reference [1] for various other
applications  of  principal  component  analysis. 

Derivation of principal components involves computation of eigenvalues and eigenvectors
of the $p \times p$ matrix $\Sigma$ (or its sample estimate). When $p$ is very large, the computation
becomes extremely expensive. Moreover, investigators are usually interested in examining
the first several leading principal components in order to find useful information. When each
component is a linear combination of a large number of variables, this becomes extremely
difficult and the results are hard to interpret. To deal with these problems, we develop the
‘block principal component analysis’ method to be discussed in the following sections.


3.  BLOCK  PRINCIPAL  COMPONENT  ANALYSIS 

Ordinary principal component analysis needs to find an orthogonal matrix $W$ such that $W'\Sigma W$
is diagonal. In a very extreme case when all of the components of $X$ are independent, the $p$
principal components are the $p$ components of $X$, and $W$ is merely some permutation of the
identity matrix, rearranging the components of $X$ according to their variances. If the random
vector $X$ can be partitioned into $k$ uncorrelated random subvectors, so that $\Sigma$ is block
diagonal, then performing principal component analysis with $X$ is equivalent to performing
principal component analysis with each subvector and then combining all the principal components
from all subvectors. This simple fact leads to the consideration of ‘block principal component
analysis’ even when $X$ does not have uncorrelated partitions.
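
The block-diagonal case can be checked numerically. The sketch below is an illustration only: it builds a covariance matrix from two hypothetical uncorrelated blocks (sizes 3 and 2, chosen arbitrarily) and compares the eigenvalues of the full matrix with the pooled eigenvalues of the blocks.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_cov(p):
    """Random symmetric positive-definite p x p matrix (illustrative only)."""
    B = rng.normal(size=(p, p))
    return B @ B.T + p * np.eye(p)

S11, S22 = random_cov(3), random_cov(2)

# Sigma with uncorrelated blocks: the off-diagonal blocks are zero.
Sigma = np.block([[S11, np.zeros((3, 2))],
                  [np.zeros((2, 3)), S22]])

# Eigenvalues of the full matrix versus the pooled block eigenvalues.
eig_full = np.sort(np.linalg.eigvalsh(Sigma))
eig_blocks = np.sort(np.concatenate([np.linalg.eigvalsh(S11),
                                     np.linalg.eigvalsh(S22)]))
```

The two sorted arrays agree up to rounding, so the variances of the principal components of $X$ are exactly those obtained block by block.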

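Let $X$ be partitioned as $X = (X_1',\ldots,X_k')'$ with $X_i$ being $p_i$-dimensional, where
$p_1 + \cdots + p_k = p$, and $\Sigma$ be partitioned accordingly as

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} & \cdots & \Sigma_{1k} \\ \vdots & \vdots & & \vdots \\ \Sigma_{k1} & \Sigma_{k2} & \cdots & \Sigma_{kk} \end{pmatrix}$$

Let $W_i$, $i = 1,\ldots,k$, be a $p_i \times p_i$ orthogonal matrix such that

$$W_i'\Sigma_{ii}W_i = \Lambda_i = \mathrm{diag}(\lambda_{i1},\ldots,\lambda_{ip_i}), \qquad \lambda_{i1} \geq \cdots \geq \lambda_{ip_i} \tag{3}$$

so that $w_{ij}$, $j = 1,\ldots,p_i$, the $j$th column of $W_i$, is an eigenvector of $\Sigma_{ii}$ corresponding
to the eigenvalue $\lambda_{ij}$. Put $U_i = W_i'X_i = (U_{i1},\ldots,U_{ip_i})'$; then the $p_i$ components $U_{ij}$,
$j = 1,\ldots,p_i$, of $U_i$ define the $p_i$ principal components, referred to as ‘block’ principal
components, with respect to the random vector $X_i$, the $i$th block of variables of $X$.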

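Now define

$$Q = \mathrm{diag}(W_1,\ldots,W_k) \tag{4}$$

also an orthogonal matrix, and

$$Y = Q'X = (U_1',\ldots,U_k')' \tag{5}$$

a random vector combining all ‘block’ principal components; then

$$\mathrm{cov}(Y) = \Omega = Q'\Sigma Q = \begin{pmatrix} \Lambda_1 & W_1'\Sigma_{12}W_2 & \cdots & W_1'\Sigma_{1k}W_k \\ \vdots & \vdots & & \vdots \\ W_k'\Sigma_{k1}W_1 & W_k'\Sigma_{k2}W_2 & \cdots & \Lambda_k \end{pmatrix} \tag{6}$$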


Note that $\Omega$ and $\Sigma$ have the same eigenvalues, and in particular, $\mathrm{tr}(\Omega) = \mathrm{tr}(\Sigma)$, where tr stands
for the trace (sum of all diagonal elements) of a matrix. Hence $X$ and $Y$ have equal total
variance. Let $W$ be defined as in (1), and

$$R = Q'W \tag{7}$$

then $R$ is also an orthogonal matrix and

$$R'\,\mathrm{cov}(Y)\,R = W'\Sigma W = \mathrm{diag}(\lambda_1,\ldots,\lambda_p) \tag{8}$$

that is, the $p$ principal components of $Y$ are identical to those of $X$.

Hence, we can obtain the principal components of a random vector $X$ in two steps. In
the first step, we group the variables in $X$ into several blocks, and then derive principal
components for each block of variables. In the second step, we define a new random vector
$Y$ by combining all the ‘block’ principal components and then obtain the principal components
of $Y$, which are identical to the principal components of $X$.

The geometrical interpretation of block principal component analysis is quite clear. The
$p$-dimensional random vector $X$ represents the $p$ axes in a $p$-dimensional space. The $p$
principal components rotate the $X$-space to one whose axes are defined by the $p$ principal
components. In order to rotate the original space to its desired direction, we can first group
the axes and rotate the axes within each group, and then do an overall rotation to achieve the
desired direction.
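
The two-step construction can be verified on a small example. The following sketch assumes three blocks of hypothetical sizes 3, 2 and 4 and a randomly generated covariance matrix; the helper `eig_desc` is a name introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
p_sizes = [3, 2, 4]                   # hypothetical block sizes p_1, p_2, p_3
p = sum(p_sizes)

B = rng.normal(size=(p, p))
Sigma = B @ B.T                       # generic covariance with correlated blocks

def eig_desc(S):
    """Eigenvalues (descending) and matching eigenvectors of a symmetric matrix."""
    lam, W = np.linalg.eigh(S)
    idx = np.argsort(lam)[::-1]
    return lam[idx], W[:, idx]

# Step 1: PCA within each diagonal block; assemble Q = diag(W_1, ..., W_k).
Q = np.zeros((p, p))
block_lams = []
start = 0
for size in p_sizes:
    sl = slice(start, start + size)
    lam_i, W_i = eig_desc(Sigma[sl, sl])
    Q[sl, sl] = W_i
    block_lams.append(lam_i)
    start += size

Omega = Q.T @ Sigma @ Q               # cov(Y) for Y = Q'X, as in (6)

# Step 2: PCA of Y gives the same eigenvalues as PCA of X.
lam_X, W = eig_desc(Sigma)
lam_Y, _ = eig_desc(Omega)
R = Q.T @ W                           # the overall rotation, as in (7)
diag_check = R.T @ Omega @ R          # should be diag(lambda_1, ..., lambda_p), as in (8)
```

Here the diagonal blocks of `Omega` are the $\Lambda_i$, its trace equals that of `Sigma`, and `diag_check` is diagonal with the eigenvalues of `Sigma`, whatever block sizes are chosen.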

From  the  mathematical  derivation  above,  we  notice  that  this  procedure  always  yields  the 
principal  components  of  X,  regardless  of  how  the  blocks  are  defined.  The  choice  of  blocks, 
however, does have effects on several aspects. First, if $X$ can be divided into uncorrelated
blocks, then the components in $Y$ are the principal components of $X$, and there is no need
to orthogonalize $Y$. Second, even when $X$ cannot be partitioned into uncorrelated blocks, if
the off-diagonal terms are relatively small, as measured, say, by a matrix norm (for
example, the sum of squares of all elements, that is, the squared Frobenius norm), then without
losing much information we can still use the components of $Y$ as an approximation to the
principal components of $X$. Third, when reduction in dimension and in the number of variables
is conducted within each block, which will be discussed in the next section, we would expect
that variables within each block are much more correlated than variables from two different
blocks, so that selection of dimensions and of variables from one block will not be much
affected by selection of variables from another block. For these reasons, we recommend
grouping the variables into blocks according to their correlation. This can be achieved by
clustering the variables using a proper function of Pearson’s correlation coefficient as the
measure of similarity between variables; one such measure is given in Section 5 of the paper.

4.  DIMENSION  REDUCTION  AND  VARIABLE  SELECTION 
4.1.  Dimension  reduction 

A  major  application  of  principal  component  analysis  is  to  reduce  data  dimension  so  that 
the  data  structures  can  be  explored  or  even  visualized  in  a  low-dimensional  space.  When 
data  dimension  is  extremely  high,  block  principal  component  analysis  allows  us  to  reduce 
data  dimension  more  effectively  without  losing  much  information.  We  propose  the  following 
procedure to achieve low dimension. Suppose $k$ blocks, $X_i$, with dimension $p_i$ and covariance
matrix $\Sigma_{ii}$, $i = 1,\ldots,k$, of the original variables $X$, are determined according to the correlation
between variables. For each block $X_i$ we derive the $p_i$ principal components, and retain
only the first $q_i$ ($< p_i$) principal components, say, $U_{ij}$, $j = 1,\ldots,q_i$, so that the total variance
explained by these $q_i$ principal components is $\pi_i\,\mathrm{tr}(\Sigma_{ii})$, where $0 < \pi_i < 1$. Now define

$$Y = (U_{11},\ldots,U_{1q_1},\ldots,U_{k1},\ldots,U_{kq_k})' \tag{9}$$

a variable combining all principal components selected from each block. We then obtain the
principal components of $Y$, and choose the first $l$ principal components, say $Z_1,\ldots,Z_l$, which
explain a high percentage, $100\tilde{\pi}$ per cent (for example, $\tilde{\pi} = 95$ per cent), of the total variance
of $Y$. Data visualization and classification with the original variable $X$ is then conducted based
on these $l$ principal components.
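
The procedure just described might be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the blocks are taken as given lists of column indices, the function names `leading_pcs` and `block_pca` are invented for the example, and the data are simulated from two hypothetical latent factors.

```python
import numpy as np

def leading_pcs(X, frac):
    """Scores on the fewest leading PCs of X explaining at least `frac` of its variance."""
    Xc = X - X.mean(axis=0)
    lam, W = np.linalg.eigh(np.atleast_2d(np.cov(Xc, rowvar=False)))
    lam, W = np.clip(lam[::-1], 0.0, None), W[:, ::-1]     # descending order
    cum = np.cumsum(lam) / lam.sum()
    q = min(int(np.searchsorted(cum, frac)) + 1, len(lam))  # smallest q with cum[q-1] >= frac
    return Xc @ W[:, :q], float(cum[q - 1])

def block_pca(X, blocks, frac_block=0.95, frac_final=0.95):
    """blocks: list of column-index arrays, one per block (assumed already determined)."""
    parts, pis = [], []
    for cols in blocks:
        U_i, pi_i = leading_pcs(X[:, cols], frac_block)     # PCA within block i
        parts.append(U_i)
        pis.append(pi_i)
    Y = np.hstack(parts)                                    # the combined vector in (9)
    Z, pi_final = leading_pcs(Y, frac_final)                # final PCA on Y
    return Z, pis, pi_final

# Hypothetical data: two blocks, each driven by its own latent factor plus noise.
rng = np.random.default_rng(3)
n = 200
f1, f2 = rng.normal(size=(n, 1)), rng.normal(size=(n, 1))
X = np.hstack([f1 + 0.3 * rng.normal(size=(n, 5)),
               f2 + 0.3 * rng.normal(size=(n, 4))])
blocks = [np.arange(0, 5), np.arange(5, 9)]

Z, pis, pi_final = block_pca(X, blocks)

# Fraction of the total variance of X retained by the columns of Z.
pi_total = Z.var(axis=0, ddof=1).sum() / X.var(axis=0, ddof=1).sum()
```

In this sketch `pis` holds the per-block fractions $\pi_i$, `pi_final` is the fraction retained in the final analysis of $Y$, and `pi_total` is the fraction of the total variance of $X$ that the retained components explain.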

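These block principal components preserve many optimal properties of the ordinary principal
components: (i) $Z_1,\ldots,Z_l$ are uncorrelated; and (ii) $\mathrm{var}(Z_1) \geq \cdots \geq \mathrm{var}(Z_l)$. However,
these variances are no longer the eigenvalues of $\Sigma$, the covariance matrix of the original
variables $X$. Instead, they are the eigenvalues of the covariance matrix of $Y$; (iii) the total
variance of $Z_1,\ldots,Z_l$ is

$$\tilde{\pi}\,\mathrm{tr}[\mathrm{cov}(Y)] = \tilde{\pi}\sum_{i=1}^{k}\sum_{j=1}^{q_i}\mathrm{var}(U_{ij}) = \tilde{\pi}\sum_{i=1}^{k}\pi_i\,\mathrm{tr}(\Sigma_{ii})$$

which accounts for $100\pi$ per cent of the total variance of $X$, where

$$\pi = \frac{\tilde{\pi}\sum_{i=1}^{k}\pi_i\,\mathrm{tr}(\Sigma_{ii})}{\mathrm{tr}(\Sigma)} = \frac{\tilde{\pi}\sum_{i=1}^{k}\pi_i\,\mathrm{tr}(\Sigma_{ii})}{\sum_{i=1}^{k}\mathrm{tr}(\Sigma_{ii})}$$

We hence have

$$\pi \geq \tilde{\pi}\,\min_i\{\pi_i\} \tag{10}$$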

When using principal components to explore (for example, cluster, visualize) the data, we
expect that the leading components explain most of the variance so that they will reveal the
true nature of the data structure; (10) asserts that block principal components $Z_1,\ldots,Z_l$ will
retain most of the variance if, within each block and for the final principal component analysis,
the selected principal components explain most of the variance. For example, if $\pi_i > 95$ per
cent, $i = 1,\ldots,k$, and $\tilde{\pi} > 95$ per cent, then $\pi > 90$ per cent.
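
The bound (10) follows from a weighted-average argument. A short sketch of the step, writing $\pi_i$ for the fraction of variance retained in block $i$, $\tilde{\pi}$ for the fraction of the variance of $Y$ retained in the final analysis, and $\pi$ for the overall fraction of the variance of $X$ (the tilde notation is an assumption made here for readability):

```latex
\pi
  = \tilde{\pi}\,
    \frac{\sum_{i=1}^{k} \pi_i \,\mathrm{tr}(\Sigma_{ii})}
         {\sum_{i=1}^{k} \mathrm{tr}(\Sigma_{ii})}
  \;\geq\;
    \tilde{\pi}\,
    \frac{\min_i\{\pi_i\} \sum_{i=1}^{k} \mathrm{tr}(\Sigma_{ii})}
         {\sum_{i=1}^{k} \mathrm{tr}(\Sigma_{ii})}
  = \tilde{\pi}\,\min_i\{\pi_i\},
\qquad
\text{e.g. } 0.95 \times 0.95 = 0.9025 > 0.90 .
```

The ratio in the first expression is a weighted average of the $\pi_i$ with non-negative weights $\mathrm{tr}(\Sigma_{ii})$, so it cannot fall below the smallest $\pi_i$.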


4.2.  Variable  selection 

When  the  number  p  of  variables  is  very  large,  many  variables  can  be  highly  correlated  with 
each other and some may become redundant when the rest are being used to explore data
structure.  For  example,  in  a  gene  microarray  experiment  where  gene  expression  of  a  large 
number  of  genes  is  obtained  for  a  number  of  tissues,  tissue  classification  based  on  all  genes 
may  be  quite  similar  to  that  based  on  a  small  group  of  genes.  If  this  is  the  case,  then 
with  respect  to  tissue  classification,  only  these  genes  are  informative  and  the  rest  become 
redundant,  assuming  that  using  all  the  genes  indeed  captures  the  real  structure  of  the  data.  It 
is  therefore  important  to  select  variables  that  contain  almost  all  information,  with  respect  to 
certain  statistical  properties,  that  all  variables  would  contain. 

Block  principal  component  analysis  can  be  used  to  select  these  variables.  We  propose  the 
following  two  steps: 

Step 1. Divide the original variable $X$ into $k$ blocks, $X_i$, $i = 1,\ldots,k$, according to correlation
between variables.

Step 2. For each block $X_i$, conduct principal component analysis and select the first $q_i$ leading
principal components such that the total variance of $X_i$ is explained by a satisfactory
amount, say, at least 95 per cent. Examine the coefficients (called loadings in much of
the principal component analysis literature) of the variables in $X_i$ in these $q_i$ leading
components and retain only those variables with large coefficients. Combine all the
variables selected from each block and then use only these variables for further
analysis.

A  third  step  may  also  be  useful  if  the  number  of  variables  selected  is  still  too  large: 

Step 3. Conduct principal component analysis again, but based only on the variables selected
in step 2. Select the first several leading principal components to explain most of the
variance. Then examine the variables again and retain those with large coefficients in
these leading combinations.

There  is  no  universal  criterion  for  how  many  and  which  variables  should  be  selected  from 
the  leading  principal  components.  Jolliffe  [1]  recommended  choosing  a  variable  from  each 
leading  principal  component  with  the  largest  absolute  coefficient,  if  the  variable  has  not  been 
selected from previous leading components. In practice some modifications of Jolliffe’s procedure
may also be effective. For example, one can choose several variables from each leading
principal component with the largest absolute coefficients. For more discussion, see
reference [1].
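
Steps 1 and 2, combined with the one-variable-per-component rule described above, can be sketched as below. This is one possible reading of the rule, written for illustration; the function names, the 95 per cent threshold default and the simulated data are assumptions, and the blocks are taken as already determined.

```python
import numpy as np

def select_variables(X, frac=0.95):
    """Column indices chosen one per leading PC: largest |loading|, skipping repeats."""
    lam, W = np.linalg.eigh(np.cov(X, rowvar=False))
    lam, W = np.clip(lam[::-1], 0.0, None), W[:, ::-1]      # descending order
    cum = np.cumsum(lam) / lam.sum()
    q = min(int(np.searchsorted(cum, frac)) + 1, len(lam))  # number of leading PCs kept
    chosen = []
    for j in range(q):
        # Walk through the variables by decreasing absolute coefficient in PC j.
        for idx in np.argsort(-np.abs(W[:, j])):
            if int(idx) not in chosen:
                chosen.append(int(idx))
                break
    return chosen

def select_by_block(X, blocks, frac=0.95):
    """Apply the selection within each block and pool the selected columns of X."""
    selected = []
    for cols in blocks:
        cols = np.asarray(cols)
        selected.extend(cols[select_variables(X[:, cols], frac)].tolist())
    return sorted(selected)

# Hypothetical data: two blocks of correlated variables (5 and 4 columns).
rng = np.random.default_rng(5)
n = 200
f1, f2 = rng.normal(size=(n, 1)), rng.normal(size=(n, 1))
X = np.hstack([f1 + 0.3 * rng.normal(size=(n, 5)),
               f2 + 0.3 * rng.normal(size=(n, 4))])
blocks = [np.arange(0, 5), np.arange(5, 9)]

selected = select_by_block(X, blocks)
```

The optional third step would amount to calling `select_variables` once more on `X[:, selected]`.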

In the next section, we demonstrate the block principal component analysis method using
the well-known NCI60 human cancer cell-line data [7] to select a group of genes to
visualize/cluster the cell lines. The result shows such selection to be quite effective.


5.  APPLICATION  TO  GENE  MICROARRAY  ANALYSIS:  AN  EXAMPLE 

The  NCI60  database  contains  expression  of  more  than  9000  genes  of  60  human  cancer  cell 
lines  from  nine  types  of  cancer  including  colorectal,  renal,  ovarian,  breast,  prostate,  lung  and 
central  nervous  system,  as  well  as  leukaemia  and  melanomas.  Gene  expression  levels  are 
expressed  as  -log(ratio),  where  ratio  =  the  red/green  fluorescence  ratio  after  computational 
balancing  of  the  two  channels.  Readers  are  referred  to  reference  [7]  for  more  details.  The 
data  have  been  made  public  for  analysis  on  the  authors’  web  site  http://discover.nci.nih.gov. 
To  get  familiar  with  the  DNA  microarray  technology,  readers  are  referred  to  references  [5] 
and  [6]  for  more  information. 

One  of  the  objectives  of  this  study  is  to  explore  the  relationship  between  gene  profiles 
and cancer phenotypes. Scherf et al. [7] used a clustering analysis method to study the
relationship. They provided the clustering tree of the 60 cell lines, based on 1376 genes, and
showed that most of the cell lines cluster together according to their phenotypes (see Figure 2a
of reference [7]). One important question is whether a smaller group of genes can preserve
the same relationship structure.

We  use  a  selection  method  based  on  block  principal  component  analysis,  as  described  in 
Section  4,  to  tackle  this  issue.  For  simplicity,  we  study  only  cell  lines  from  three  types 
of  cancer,  colorectal  (7  cell  lines),  leukaemia  (6  cell  lines)  and  renal  (8  cell  lines);  each 
cell line has microarray expression of the same 1416 genes. The data set of interest, 21
cell lines (being the subjects) and 1416 genes (being the variables), hence forms a 21 × 1416
matrix, representing 21 data points (21 rows of the matrix) in a 1416-dimensional data space.
The  complete-linkage  clustering  tree  based  on  these  1416  genes  is  shown  in  Figure  1(a). 
The  dendrogram  is  consistent  with  that  in  reference  [7],  and  shows  clearly  that  the  21  cell 
lines  cluster  according  to  their  cancer  phenotypes.  The  readers  are  reminded  that  phenotype 
information  is  not  used  in  the  clustering,  but  only  to  validate  the  clustering  results.  One 
renal  cell  line  marked  as  ‘RE8’,  which  is  farther  from  the  rest  of  renal  cell  lines,  has  been 
recognized  to  have  some  special  feature  (see  reference  [7]  for  detail). 

Figure 1. Dendrogram of complete linkage hierarchical clustering of 21 cell lines: (a) tree based on
1416 genes; (b) tree based on 200 genes. CO is colorectal, LE is leukaemia and RE is renal.

We now seek to determine the blocks for the 1416 genes. Figure 2 shows a plot of semi-partial
$R^2$ versus the number of clusters using the complete-linkage algorithm and $d_{ij} = \arccos(|\rho_{ij}|)$
as a measure of dissimilarity between gene $i$ and gene $j$, where $\rho_{ij}$ is the Pearson correlation
coefficient. The semi-partial $R^2$ measures the loss of homogeneity when two clusters are
merged. Define $SS_T$ as the corrected total sum of squares of all subjects, summed over
all variables. For a certain cluster $C$, let $SS_C$ be the corrected total sum of squares of all
subjects in cluster $C$, summed over all variables. Then the semi-partial $R^2$ for combining two
clusters $C_1$ and $C_2$ into one cluster $C$ is $(SS_C - SS_{C_1} - SS_{C_2})/SS_T$. A large semi-partial $R^2$
indicates a significant decrease in homogeneity. Since subjects within the same cluster should
be similar, two clusters should not be combined as one cluster if the semi-partial $R^2$ is large.
In practice we determine the number of clusters by minimizing the semi-partial $R^2$; a plot
of the semi-partial $R^2$ versus the number of clusters is extremely helpful. More discussion
and computation of semi-partial $R^2$ can be found in reference [8]. Other statistics can also be
used to determine the number of clusters in the data. Milligan and Cooper [9] examined 30
procedures for determining the number of clusters, including several variations based on sum
of squares.
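
The two quantities defined above can be computed directly. The sketch below is illustrative only: the function names and the simulated inputs are assumptions, and the agglomeration itself is left to a standard routine (for example, `scipy.cluster.hierarchy.linkage` with `method="complete"` applied to the condensed form of the dissimilarity matrix).

```python
import numpy as np

def corr_dissimilarity(X):
    """d_ij = arccos(|rho_ij|) between the columns (variables) of X."""
    rho = np.corrcoef(X, rowvar=False)
    return np.arccos(np.clip(np.abs(rho), 0.0, 1.0))   # clip guards against rounding

def corrected_ss(P):
    """Corrected total sum of squares of the rows of P, summed over all columns."""
    return float(((P - P.mean(axis=0)) ** 2).sum())

def semi_partial_r2(P, members1, members2):
    """(SS_C - SS_C1 - SS_C2) / SS_T for merging clusters C1 and C2 (row indices of P)."""
    merged = np.concatenate([members1, members2])
    return (corrected_ss(P[merged]) - corrected_ss(P[members1])
            - corrected_ss(P[members2])) / corrected_ss(P)

# Hypothetical objects to be clustered: two tight groups of six points each.
rng = np.random.default_rng(4)
P = np.vstack([rng.normal(0.0, 0.1, size=(6, 3)),
               rng.normal(5.0, 0.1, size=(6, 3))])

# Merging two halves of the same group loses little homogeneity ...
within = semi_partial_r2(P, np.arange(0, 3), np.arange(3, 6))
# ... whereas merging the two distinct groups loses almost all of it.
across = semi_partial_r2(P, np.arange(0, 6), np.arange(6, 12))

# Dissimilarity matrix for four hypothetical variables observed on 50 subjects.
D = corr_dissimilarity(rng.normal(size=(50, 4)))
```

Note that when genes are being grouped into blocks, the rows of `P` are the genes (the objects being clustered) and its columns are the cell lines.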

For the cancer cell-line microarray data, the semi-partial $R^2$ becomes nearly flat after 14
clusters. This indicates that the 1416 genes can be approximately divided into 14 clusters;
further dividing the data gains little in reducing heterogeneity. These clusters of genes determine
the blocks within each of which principal component analysis will be conducted. The
number of genes in the blocks ranges from 43 to 158 (Table I). Principal component analysis
is conducted within each block, and the first several leading principal components are then
selected, resulting in a total of 200 principal components. For each block, the selected principal
components explain >95 per cent of the total variance in that block. For each block, Table I
lists the number of genes, the number of selected principal components and the percentage of
total variance explained by these leading components.

Figure 2. Determining number of blocks: plot of semi-partial $R^2$.

Table I. Summary of 14 gene blocks.

Block   Number of genes   Number of PCs   Per cent variance
  1          107               14               95.6
  2          154               13               95.3
  3           88               15               95.2
  4          158               15               95.3
  5           68               14               95.2
  6          152               15               96
  7           84               14               95.1
  8          136               15               95.2
  9           44               14               96
 10           84               16               96
 11          143               16               95.1
 12           82               14               96
 13           73               14               95.6
 14           43               11               95.6
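
The totals quoted in the text can be checked against the entries of Table I. The lists below simply transcribe the table; the variable names are chosen for the example.

```python
# Entries of Table I, blocks 1 to 14.
genes = [107, 154, 88, 158, 68, 152, 84, 136, 44, 84, 143, 82, 73, 43]
pcs = [14, 13, 15, 15, 14, 15, 14, 15, 14, 16, 16, 14, 14, 11]
pct = [95.6, 95.3, 95.2, 95.3, 95.2, 96.0, 95.1,
       95.2, 96.0, 96.0, 95.1, 96.0, 95.6, 95.6]

total_genes = sum(genes)   # 1416, the number of genes analysed
total_pcs = sum(pcs)       # 200, the number of selected principal components
smallest_block, largest_block = min(genes), max(genes)   # 43 and 158
lowest_pct = min(pct)      # 95.1, so every block retains more than 95 per cent
```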

For  each  block,  genes  with  largest  coefficients  in  the  selected  leading  principal  components 
are retained, using Jolliffe’s one-variable-per-leading-component method. This yields a total
of  200  genes  for  further  analysis. 

The first three leading principal components, computed based on the 1416 genes, explain
only 49 per cent of the total variance. Two- or three-dimensional visualization of the data
based on these principal components can be very misleading. We validate these selected 200
genes by deriving a hierarchical clustering tree for the 21 cell lines based on gene expressions.
The dendrogram is shown in Figure 1(b). It is remarkably similar to the one based on all
1416 genes (Figure 1(a)). Both illustrate that cell lines with the same phenotype are more
similar than those from different phenotypes. This shows that a much smaller number of genes
can provide the same insight for the data as the whole set of genes, and block principal component
analysis provides an effective way to achieve this. Note that a hierarchical clustering
dendrogram, obtained based on a set of variables, is essentially the same as the hierarchical
clustering dendrogram obtained based on leading principal components, provided that these
leading components explain most of the variation among the variables. The remarkable resemblance
between Figure 1(a) and Figure 1(b) further demonstrates the effectiveness of the
block principal component analysis method, as compared to the ordinary principal component
analysis.


6.  DISCUSSION 

In  this  paper  we  show  that  a  much  smaller  number  of  genes  can  provide  the  same  insight 
for  the  cancer  phenotypes  as  the  whole  set  of  genes.  We  demonstrate  that  block  principal 
component analysis is an effective way to select these genes. This kind of analysis is ‘unsupervised’,
a term popular in neural network/pattern recognition [10]; cancer phenotypes are
used only to validate the algorithm and analysis.

Selection  of  informative  genes  in  the  microarray  setting,  and  other  settings  as  well,  is  by 
no  means  an  easy  task,  especially  when  the  analysis  is  unsupervised.  Very  likely  the  choices 
of  genes  are  not  unique;  there  might  exist  several  groups  of  genes  that  provide  the  same 
classification.  Biostatisticians  should  provide  every  potential  group  of  genes  to  the  medical 
investigators  and  hopefully  a  meaningful  group  of  genes  can  be  determined  by  combining  the 
statistical  guidance  and  biological  knowledge.  Indeed,  some  preliminary  selection  of  genes 
based  on  biological  knowledge  is  extremely  valuable,  even  before  any  statistical  analysis  is 
conducted. It should be noted, however, that genes that are biologically similar/dissimilar may
not be statistically similar (correlated)/dissimilar (uncorrelated).